1
Zaman A, Bivona TG. Quantitative Framework for Bench-to-Bedside Cancer Research. Cancers (Basel) 2022; 14:5254. [PMID: 36358671 PMCID: PMC9658824 DOI: 10.3390/cancers14215254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Revised: 10/20/2022] [Accepted: 10/24/2022] [Indexed: 11/29/2022] Open
Abstract
Bioscience is an interdisciplinary venture. Driven by a quantum shift in the volume of high-throughput data and in the ready availability of data-intensive technologies, mathematical and quantitative approaches have become increasingly common in bioscience. For instance, a recent shift towards a quantitative description of cells and phenotypes, which is supplanting conventional qualitative descriptions, has generated immense promise and opportunities in the field of bench-to-bedside cancer OMICS, chemical biology and pharmacology. Nevertheless, as in any burgeoning field, a shared and standardized framework for quantitative cancer research is still lacking. Here, in the context of cancer, we present a basic framework and guidelines for bench-to-bedside quantitative research and therapy. We outline some of the basic concepts and their parallel use cases for chemical-protein interactions. Along with several recommendations for assay setup and conditions, we also catalog applications of these quantitative techniques in some of the most widespread discovery pipelines and analytical methods in the field. We believe adherence to these guidelines will improve experimental design, reduce variability and standardize quantitative datasets.
Affiliation(s)
- Aubhishek Zaman
  - Department of Medicine, University of California, San Francisco, CA 94158, USA
  - UCSF Helen Diller Comprehensive Cancer Center, University of California, San Francisco, CA 94158, USA
- Trever G. Bivona
  - Department of Medicine, University of California, San Francisco, CA 94158, USA
  - UCSF Helen Diller Comprehensive Cancer Center, University of California, San Francisco, CA 94158, USA
  - Chan-Zuckerberg Biohub, San Francisco, CA 94158, USA
2
Maghsoudi Z, Nguyen H, Tavakkoli A, Nguyen T. A comprehensive survey of the approaches for pathway analysis using multi-omics data integration. Brief Bioinform 2022; 23:6761962. [PMID: 36252928 PMCID: PMC9677478 DOI: 10.1093/bib/bbac435] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/28/2022] [Revised: 08/26/2022] [Accepted: 09/08/2022] [Indexed: 02/07/2023] Open
Abstract
Pathway analysis has been widely used to detect pathways and functions associated with complex disease phenotypes. The proliferation of this approach is due to better interpretability of its results and its higher statistical power compared with the gene-level statistics. A plethora of pathway analysis methods that utilize multi-omics setup, rather than just transcriptomics or proteomics, have recently been developed to discover novel pathways and biomarkers. Since multi-omics gives multiple views into the same problem, different approaches are employed in aggregating these views into a comprehensive biological context. As a result, a variety of novel hypotheses regarding disease ideation and treatment targets can be formulated. In this article, we review 32 such pathway analysis methods developed for multi-omics and multi-cohort data. We discuss their availability and implementation, assumptions, supported omics types and databases, pathway analysis techniques and integration strategies. A comprehensive assessment of each method's practicality, and a thorough discussion of the strengths and drawbacks of each technique will be provided. The main objective of this survey is to provide a thorough examination of existing methods to assist potential users and researchers in selecting suitable tools for their data and analysis purposes, while highlighting outstanding challenges in the field that remain to be addressed for future development.
Affiliation(s)
- Zeynab Maghsoudi
  - Department of Computer Science and Engineering, University of Nevada, Reno, 89557, Nevada, USA
- Ha Nguyen
  - Department of Computer Science and Engineering, University of Nevada, Reno, 89557, Nevada, USA
- Alireza Tavakkoli
  - Department of Computer Science and Engineering, University of Nevada, Reno, 89557, Nevada, USA
- Tin Nguyen
  - Corresponding author: Tin Nguyen, Department of Computer Science and Engineering, University of Nevada, Reno, NV, USA. Tel.: +1-775-784-6619
3
Xie X, Kendzior MC, Ge X, Mainzer LS, Sinha S. VarSAn: associating pathways with a set of genomic variants using network analysis. Nucleic Acids Res 2021; 49:8471-8487. [PMID: 34313777 PMCID: PMC8421213 DOI: 10.1093/nar/gkab624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2020] [Revised: 05/18/2021] [Accepted: 07/20/2021] [Indexed: 02/01/2023] Open
Abstract
There is a pressing need today to mechanistically interpret sets of genomic variants associated with diseases. Here we present a tool called ‘VarSAn’ that uses a network analysis algorithm to identify pathways relevant to a given set of variants. VarSAn analyzes a configurable network whose nodes represent variants, genes and pathways, using a Random Walk with Restarts algorithm to rank pathways for relevance to the given variants, and reports P-values for pathway relevance. It treats non-coding and coding variants differently, properly accounts for the number of pathways impacted by each variant and identifies relevant pathways even if many variants do not directly impact genes of the pathway. We use VarSAn to identify pathways relevant to variants related to cancer and several other diseases, as well as drug response variation. We find VarSAn's pathway ranking to be complementary to the standard approach of enrichment tests on genes related to the query set. We adopt a novel benchmarking strategy to quantify its advantage over this baseline approach. Finally, we use VarSAn to discover key pathways, including the VEGFA-VEGFR2 pathway, related to de novo variants in patients of Hypoplastic Left Heart Syndrome, a rare and severe congenital heart defect.
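The Random Walk with Restarts idea mentioned in this abstract can be illustrated with a minimal sketch. This is not VarSAn's implementation: the toy network, the node names, the restart probability of 0.5 and the handling of isolated nodes are all assumptions made for illustration, and the sketch stops at ranking scores without computing the P-values the abstract describes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walk that restarts at the seeds."""
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes (here, the query variants).
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Assumption: a node without neighbours sends its mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            # Otherwise the walker moves to a uniformly chosen neighbour.
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network with variant, gene and pathway nodes (undirected edges).
edges = [
    ("var1", "GENE_A"), ("var2", "GENE_B"),
    ("GENE_A", "pathway_1"), ("GENE_B", "pathway_1"),
    ("GENE_B", "GENE_C"), ("GENE_C", "pathway_2"),
]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

scores = random_walk_with_restart(adjacency, seeds={"var1", "var2"})
pathway_ranking = sorted(
    (n for n in scores if n.startswith("pathway_")), key=scores.get, reverse=True
)
```

In this toy network `pathway_1` is reached through genes adjacent to both seed variants, while `pathway_2` is only reachable through a gene-gene edge, so the walk assigns it a lower score. This illustrates how a pathway can receive a nonzero score even when no variant directly touches its genes.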
Affiliation(s)
- Xiaoman Xie
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Matthew C Kendzior
  - National Center for Supercomputing Applications, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Xiyu Ge
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Liudmila S Mainzer
  - National Center for Supercomputing Applications, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Saurabh Sinha
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
  - Cancer Center of Illinois, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
4
Rian K, Hidalgo MR, Çubuk C, Falco MM, Loucera C, Esteban-Medina M, Alamo-Alvarez I, Peña-Chilet M, Dopazo J. Genome-scale mechanistic modeling of signaling pathways made easy: A bioconductor/cytoscape/web server framework for the analysis of omic data. Comput Struct Biotechnol J 2021; 19:2968-2978. [PMID: 34136096 PMCID: PMC8170118 DOI: 10.1016/j.csbj.2021.05.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/27/2021] [Revised: 04/21/2021] [Accepted: 05/11/2021] [Indexed: 12/13/2022] Open
Abstract
Genome-scale mechanistic models of pathways are gaining importance for genomic data interpretation because they provide a natural link between genotype measurements (transcriptomics or genomics data) and the phenotype of the cell (its functional behavior). Moreover, mechanistic models can be used to predict the potential effect of interventions, including drug inhibitions. Here, we present the implementation of a mechanistic model of cell signaling for the interpretation of transcriptomic data as an R/Bioconductor package, a Cytoscape plugin and a web tool with enhanced functionality which includes building interpretable predictors, estimation of the effect of perturbations and assessment of the effect of mutations in complex scenarios.
Affiliation(s)
- Kinza Rian
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, Sevilla 41013, Spain
  - Laboratory of Innovative Technologies (LTI), National School of Applied Sciences in Tangier, UAE, Morocco
- Marta R. Hidalgo
  - Bioinformatics and Biostatistics Unit, Centro de Investigación Príncipe Felipe (CIPF), 46012 Valencia, Spain
- Cankut Çubuk
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, Sevilla 41013, Spain
- Matias M. Falco
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, Sevilla 41013, Spain
  - Bioinformatics in RareDiseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Sevilla 41013, Spain
- Carlos Loucera
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, Sevilla 41013, Spain
  - Computational Systems Medicine. Institute of Biomedicine of Seville (IBiS), Sevilla 41013, Spain
- Marina Esteban-Medina
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, Sevilla 41013, Spain
  - Computational Systems Medicine. Institute of Biomedicine of Seville (IBiS), Sevilla 41013, Spain
- Inmaculada Alamo-Alvarez
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, Sevilla 41013, Spain
  - Computational Systems Medicine. Institute of Biomedicine of Seville (IBiS), Sevilla 41013, Spain
- María Peña-Chilet
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, Sevilla 41013, Spain
  - Bioinformatics in RareDiseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Sevilla 41013, Spain
  - Computational Systems Medicine. Institute of Biomedicine of Seville (IBiS), Sevilla 41013, Spain
- Joaquín Dopazo
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocío, Sevilla 41013, Spain
  - Bioinformatics in RareDiseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Sevilla 41013, Spain
  - Computational Systems Medicine. Institute of Biomedicine of Seville (IBiS), Sevilla 41013, Spain
  - Functional Genomics Node (INB-ELIXIR-es), Sevilla, Spain
5
Savino A, Provero P, Poli V. Differential Co-Expression Analyses Allow the Identification of Critical Signalling Pathways Altered during Tumour Transformation and Progression. Int J Mol Sci 2020; 21:E9461. [PMID: 33322692 PMCID: PMC7764314 DOI: 10.3390/ijms21249461] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/30/2020] [Revised: 12/02/2020] [Accepted: 12/09/2020] [Indexed: 02/02/2023] Open
Abstract
Biological systems respond to perturbations through the rewiring of molecular interactions, organised in gene regulatory networks (GRNs). Among these, the increasingly high availability of transcriptomic data makes gene co-expression networks the most exploited ones. Differential co-expression networks are useful tools to identify changes in response to an external perturbation, such as mutations predisposing to cancer development, and leading to changes in the activity of gene expression regulators or signalling. They can help explain the robustness of cancer cells to perturbations and identify promising candidates for targeted therapy, moreover providing higher specificity with respect to standard co-expression methods. Here, we comprehensively review the literature about the methods developed to assess differential co-expression and their applications to cancer biology. Via the comparison of normal and diseased conditions and of different tumour stages, studies based on these methods led to the definition of pathways involved in gene network reorganisation upon oncogenes' mutations and tumour progression, often converging on immune system signalling. A relevant implementation still lagging behind is the integration of different data types, which would greatly improve network interpretability. Most importantly, performance and predictivity evaluation of the large variety of mathematical models proposed would urgently require experimental validations and systematic comparisons. We believe that future work on differential gene co-expression networks, complemented with additional omics data and experimentally tested, will considerably improve our insights into the biology of tumours.
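As a minimal illustration of what "differential co-expression" can mean in its simplest form, the sketch below computes, for every gene pair, the change in Pearson correlation between two conditions. The review covers many methods and does not prescribe this one; the gene names, expression values and the choice of a plain correlation difference are assumptions for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long lists of expression values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def differential_coexpression(expr_a, expr_b):
    """Correlation in condition B minus correlation in condition A, per gene pair."""
    genes = sorted(expr_a)
    result = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            result[(g, h)] = pearson(expr_b[g], expr_b[h]) - pearson(expr_a[g], expr_a[h])
    return result


# Hypothetical expression values (five samples per condition, made up for this sketch).
normal = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [2, 4, 6, 8, 10],
    "GENE_C": [5, 3, 4, 1, 2],
}
tumour = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [10, 8, 6, 4, 2],
    "GENE_C": [5, 3, 4, 1, 2],
}

delta = differential_coexpression(normal, tumour)
# Gene pairs whose correlation changed most between the two conditions come first.
rewired = sorted(delta, key=lambda pair: abs(delta[pair]), reverse=True)
```

In this toy data the pair `GENE_A`/`GENE_B` flips from perfectly correlated to perfectly anti-correlated, so it ranks first. Published methods typically add a significance test (for example on Fisher-transformed correlations), which this sketch omits.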
Affiliation(s)
- Aurora Savino
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
- Paolo Provero
  - Department of Neurosciences “Rita Levi Montalcini”, University of Turin, Corso Massimo D’Ázeglio 52, 10126 Turin, Italy
  - Center for Omics Sciences, Ospedale San Raffaele IRCCS, Via Olgettina 60, 20132 Milan, Italy
- Valeria Poli
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
6
Maleki F, Ovens K, Hogan DJ, Kusalik AJ. Gene Set Analysis: Challenges, Opportunities, and Future Research. Front Genet 2020; 11:654. [PMID: 32695141 PMCID: PMC7339292 DOI: 10.3389/fgene.2020.00654] [Citation(s) in RCA: 106] [Impact Index Per Article: 26.5] [Received: 02/01/2020] [Accepted: 05/29/2020] [Indexed: 12/14/2022] Open
Abstract
Gene set analysis methods are widely used to provide insight into high-throughput gene expression data. There are many gene set analysis methods available. These methods rely on various assumptions and have different requirements, strengths and weaknesses. In this paper, we classify gene set analysis methods based on their components, describe the underlying requirements and assumptions for each class, and provide directions for future research in developing and evaluating gene set analysis methods.
7
Zhao Y, Piekos S, Hoang TH, Shin DG. A framework using topological pathways for deeper analysis of transcriptome data. BMC Genomics 2020; 21:834. [PMID: 32138666 PMCID: PMC7057456 DOI: 10.1186/s12864-019-6155-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2019] [Accepted: 09/30/2019] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND: Pathway analysis is one of the later-stage data analysis steps essential in interpreting high-throughput gene expression data. We propose a set of algorithms which, given gene expression data, can recognize which portions of sub-pathways are actively utilized in the biological system being studied. The degree of activation is measured by the conditional probability of the input expression data based on the Bayesian Network model constructed from the topological pathway. RESULTS: We demonstrate the effectiveness of our pathway analysis method by conducting two case studies. The first one applies our method to a well-studied temporal microarray data set for the cell cycle using the KEGG Cell Cycle pathway. Our method closely reproduces the biological claims associated with the data sets, but unlike the original work, ours can show how pathway routes interact with each other, above and beyond merely identifying which pathway routes are involved in the process. The second study applies the method to the p53 mutation microarray data to perform a comparative study. CONCLUSIONS: We show that our method achieves comparable performance against all other pathway analysis systems included in this study in identifying p53 altered pathways. Our method could pave a new way of carrying out next generation pathway analysis.
Affiliation(s)
- Yue Zhao
  - Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, 06269 USA
- Stephanie Piekos
  - Department of Pharmaceutical Sciences, University of Connecticut, 69 North Eagleville Road, Unit 3092, Storrs, USA
- Tham H. Hoang
  - Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, 06269 USA
- Dong-Guk Shin
  - Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, 06269 USA
8
Nguyen TM, Shafi A, Nguyen T, Draghici S. Identifying significantly impacted pathways: a comprehensive review and assessment. Genome Biol 2019; 20:203. [PMID: 31597578 PMCID: PMC6784345 DOI: 10.1186/s13059-019-1790-4] [Citation(s) in RCA: 96] [Impact Index Per Article: 19.2] [Received: 02/15/2019] [Accepted: 08/13/2019] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND: Many high-throughput experiments compare two phenotypes such as disease vs. healthy, with the goal of understanding the underlying biological phenomena characterizing the given phenotype. Because of the importance of this type of analysis, more than 70 pathway analysis methods have been proposed so far. These can be categorized into two main categories: non-topology-based (non-TB) and topology-based (TB). Although some review papers discuss this topic from different aspects, there is no systematic, large-scale assessment of such methods. Furthermore, the majority of the pathway analysis approaches rely on the assumption of uniformity of p values under the null hypothesis, which is often not true. RESULTS: This article presents the most comprehensive comparative study on pathway analysis methods available to date. We compare the actual performance of 13 widely used pathway analysis methods in over 1085 analyses. These comparisons were performed using 2601 samples from 75 human disease data sets and 121 samples from 11 knockout mouse data sets. In addition, we investigate the extent to which each method is biased under the null hypothesis. Together, these data and results constitute a reliable benchmark against which future pathway analysis methods could and should be tested. CONCLUSION: Overall, the result shows that no method is perfect. In general, TB methods appear to perform better than non-TB methods. This is somewhat expected since the TB methods take into consideration the structure of the pathway which is meant to describe the underlying phenomena. We also discover that most, if not all, listed approaches are biased and can produce skewed results under the null.
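The notion of p values being uniform under the null hypothesis can be checked numerically. The sketch below is not the benchmark procedure of the paper: it simply measures the Kolmogorov-Smirnov distance between a set of p values and the Uniform(0, 1) distribution. The simulated "calibrated" and "biased" p values are assumptions for illustration (the biased set is made by squaring uniform draws, which pushes p values towards zero).

```python
import random


def ks_distance_from_uniform(p_values):
    """Kolmogorov-Smirnov distance between the empirical CDF and Uniform(0, 1)."""
    xs = sorted(p_values)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        # The empirical CDF jumps from i/n to (i+1)/n at x; the uniform CDF equals x.
        distance = max(distance, (i + 1) / n - x, x - i / n)
    return distance


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# A well-calibrated method yields uniformly distributed p values under the null.
calibrated = [rng.random() for _ in range(2000)]
# Hypothetical biased method: p values skewed towards 0, inflating false positives.
biased = [u * u for u in calibrated]

d_calibrated = ks_distance_from_uniform(calibrated)
d_biased = ks_distance_from_uniform(biased)
```

A small distance for the calibrated set and a much larger one for the skewed set is the pattern one would expect; in practice the p values would come from running a pathway analysis method on data where no real signal exists, such as permuted sample labels.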
Affiliation(s)
- Tuan-Minh Nguyen
  - Department of Computer Science, Wayne State University, Detroit, 48202 USA
- Adib Shafi
  - Department of Computer Science, Wayne State University, Detroit, 48202 USA
- Tin Nguyen
  - Department of Computer Science and Engineering, University of Nevada, Reno, 89557 USA
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, 48202 USA
  - Department of Obstetrics and Gynecology, Wayne State University, Detroit, 48202 USA
9
Amadoz A, Hidalgo MR, Çubuk C, Carbonell-Caballero J, Dopazo J. A comparison of mechanistic signaling pathway activity analysis methods. Brief Bioinform 2019; 20:1655-1668. [PMID: 29868818 PMCID: PMC6917216 DOI: 10.1093/bib/bby040] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 01/08/2018] [Revised: 03/31/2018] [Indexed: 12/11/2022] Open
Abstract
Understanding the aspects of cell functionality that account for disease mechanisms or drug modes of action is a main challenge for precision medicine. Classical gene-based approaches ignore the modular nature of most human traits, whereas conventional pathway enrichment approaches produce only illustrative results of limited practical utility. Recently, a family of new methods has emerged that shifts the focus from whole pathways to the definition of elementary subpathways within them that have mechanistic significance, and to the study of their activities. Thus, mechanistic pathway activity (MPA) methods constitute a new paradigm that allows recoding poorly informative genomic measurements into quantitative values of cell activity and relating them to phenotypes. Here we provide a review of the MPA methods available and explain their contribution to systems medicine approaches for addressing challenges in the diagnosis and treatment of complex diseases.
Affiliation(s)
- Alicia Amadoz
  - Department of Bioinformatics, Igenomix S.L., 46980 Valencia, Spain
- Marta R Hidalgo
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), CDCA, Hospital Virgen del Rocio, Sevilla 41013, Spain
- Cankut Çubuk
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), CDCA, Hospital Virgen del Rocio, Sevilla 41013, Spain
- José Carbonell-Caballero
  - Chromatin and Gene expression Lab, Gene Regulation, Stem Cells and Cancer Program, Centre de Regulació Genòmica (CRG), The Barcelona Institute of Science and Technology, PRBB, Barcelona 08003, Spain
- Joaquín Dopazo
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), CDCA, Hospital Virgen del Rocio, Sevilla 41013, Spain
  - Chromatin and Gene expression Lab, Gene Regulation, Stem Cells and Cancer Program, Centre de Regulació Genòmica (CRG), The Barcelona Institute of Science and Technology, PRBB, Barcelona 08003, Spain
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), CDCA, Hospital Virgen del Rocio, Sevilla 41013, Spain, Functional Genomics Node (INB), FPS, Hospital Virgen del Rocío, Sevilla 41013, Spain and Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, Sevilla 41013, Spain
10
Nguyen T, Mitrea C, Draghici S. Network-Based Approaches for Pathway Level Analysis. Curr Protoc Bioinformatics 2019; 61:8.25.1-8.25.24. [PMID: 30040185 DOI: 10.1002/cpbi.42] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 11/10/2022]
Abstract
Identification of impacted pathways is an important problem because it allows us to gain insights into the underlying biology beyond the detection of differentially expressed genes. In the past decade, a plethora of methods have been developed for this purpose. The last generation of pathway analysis methods are designed to take into account various aspects of pathway topology in order to increase the accuracy of the findings. Here, we cover 34 such topology-based pathway analysis methods published in the past 13 years. We compare these methods on categories related to implementation, availability, input format, graph models, and statistical approaches used to compute pathway level statistics and statistical significance. We also discuss a number of critical challenges that need to be addressed, arising both in methodology and pathway representation, including inconsistent terminology, data format, lack of meaningful benchmarks, and, more importantly, a systematic bias that is present in most existing methods. © 2018 by John Wiley & Sons, Inc.
Affiliation(s)
- Tin Nguyen
  - Department of Computer Science and Engineering, University of Nevada, Reno, Nevada
- Cristina Mitrea
  - Department of Computer Science, Wayne State University, Detroit, Michigan
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, Michigan
  - Department of Obstetrics and Gynecology, Wayne State University, Detroit, Michigan
11
Feng C, Song C, Ning Z, Ai B, Wang Q, Xu Y, Li M, Bai X, Zhao J, Liu Y, Li X, Zhang J, Li C. ce-Subpathway: Identification of ceRNA-mediated subpathways via joint power of ceRNAs and pathway topologies. J Cell Mol Med 2018; 23:967-984. [PMID: 30421585 PMCID: PMC6349186 DOI: 10.1111/jcmm.13997] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/12/2018] [Revised: 08/28/2018] [Accepted: 10/17/2018] [Indexed: 12/19/2022] Open
Abstract
Competing endogenous RNAs (ceRNAs) represent a novel mechanism of gene regulation that may mediate key subpathway regions and contribute to the altered activities of pathways. However, the classical methods used to identify pathways fail to specifically consider ceRNAs within the pathways and the key regions impacted by them. We proposed a strategy named ce-Subpathway for the identification of ceRNA-mediated functional subpathways. It provides an effective level of pathway analysis by integrating ceRNAs, differentially expressed (DE) genes and their key regions within the given pathways. We analysed one pulmonary arterial hypertension (PAH) data set and one myocardial infarction (MI) data set and demonstrated that ce-Subpathway could identify many subpathways whose corresponding entire pathways were missed by non-ceRNA-mediated pathway identification methods. These pathways have been well reported to be associated with PAH/MI-related cardiovascular diseases. Further evidence from several data sets of different cancers, including breast cancer, oesophageal cancer and colon cancer, showed the reliability of the ceRNA interactions and the robustness and reproducibility of the ce-Subpathway strategy. Finally, survival analysis on a further pancreatic cancer data set was applied to illustrate the clinical application value of the ceRNA-mediated functional subpathways. Comprehensive analyses have shown the power of a joint ceRNAs/DE genes and subpathway strategy based on their topologies.
Affiliation(s)
- Chenchen Feng
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Chao Song
  - Department of Pharmacology, Daqing Campus, Harbin Medical University, Daqing, China
- Ziyu Ning
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Bo Ai
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Qiuyu Wang
  - School of Nursing, Daqing Campus, Harbin Medical University, Daqing, China
- Yong Xu
  - The fifth Affiliated Hospital of Harbin Medical University, Daqing, China
- Meng Li
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Xuefeng Bai
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Jianmei Zhao
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Yuejuan Liu
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Xuecang Li
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Jian Zhang
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
- Chunquan Li
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
12
Glaab E. Computational systems biology approaches for Parkinson's disease. Cell Tissue Res 2018; 373:91-109. [PMID: 29185073 PMCID: PMC6015628 DOI: 10.1007/s00441-017-2734-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/24/2017] [Accepted: 11/06/2017] [Indexed: 12/26/2022]
Abstract
Parkinson's disease (PD) is a prime example of a complex and heterogeneous disorder, characterized by multifaceted and varied motor- and non-motor symptoms and different possible interplays of genetic and environmental risk factors. While investigations of individual PD-causing mutations and risk factors in isolation are providing important insights to improve our understanding of the molecular mechanisms behind PD, there is a growing consensus that a more complete understanding of these mechanisms will require an integrative modeling of multifactorial disease-associated perturbations in molecular networks. Identifying and interpreting the combinatorial effects of multiple PD-associated molecular changes may pave the way towards an earlier and reliable diagnosis and more effective therapeutic interventions. This review provides an overview of computational systems biology approaches developed in recent years to study multifactorial molecular alterations in complex disorders, with a focus on PD research applications. Strengths and weaknesses of different cellular pathway and network analyses, and multivariate machine learning techniques for investigating PD-related omics data are discussed, and strategies proposed to exploit the synergies of multiple biological knowledge and data sources. A final outlook provides an overview of specific challenges and possible next steps for translating systems biology findings in PD to new omics-based diagnostic tools and targeted, drug-based therapeutic approaches.
Affiliation(s)
- Enrico Glaab
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 7 avenue des Hauts Fourneaux, L-4362, Esch-sur-Alzette, Luxembourg.
13
Modeling Transcriptional Rewiring in Neutrophils Through the Course of Treated Juvenile Idiopathic Arthritis. Sci Rep 2018; 8:7805. [PMID: 29773851 PMCID: PMC5958082 DOI: 10.1038/s41598-018-26163-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/13/2017] [Accepted: 05/04/2018] [Indexed: 12/28/2022] Open
Abstract
Neutrophils in children with the polyarticular form of juvenile idiopathic arthritis (JIA) display abnormal transcriptional patterns linked to fundamental metabolic derangements. In this study, we sought to determine the effects of therapy on mRNA and miRNA expression networks in polyarticular JIA. Using exon and miRNA microarrays, we studied children with untreated active JIA (ADU, n = 35), children with active disease on therapy with methotrexate ± etanercept (ADT, n = 26), and children with inactive disease also on therapy (ID, n = 14). We compared the results to findings from healthy control children (HC, n = 35). We found substantial re-ordering of mRNA and miRNA expression networks after the initiation of therapy. Each disease state was associated with a distinct transcriptional profile, with the ADT state differing the most from HC, and ID more strongly resembling HC. Changes at the mRNA level were mirrored in changes in miRNA expression patterns. The analysis of the expression dynamics from differentially expressed genes across three disease states indicated that therapeutic response is a complex process. This process does not simply involve genes slowly correcting in a linear fashion over time. Computational modeling of miRNA and transcription factor (TF) co-regulatory networks demonstrated that combinational regulation of miRNA and TF might play an important role in dynamic transcriptome changes.
|
14
|
Park HJ, Kim S, Li W. Model-based analysis of competing-endogenous pathways (MACPath) in human cancers. PLoS Comput Biol 2018; 14:e1006074. [PMID: 29565967 PMCID: PMC5882149 DOI: 10.1371/journal.pcbi.1006074] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/05/2017] [Revised: 04/03/2018] [Accepted: 03/06/2018] [Indexed: 01/24/2023] Open
Abstract
Competing endogenous RNA (ceRNA) has emerged as an important post-transcriptional mechanism that simultaneously alters the expression of thousands of genes in cancers. However, only a few ceRNA genes have been studied for their functions to date. To understand the major biological functions of thousands of ceRNA genes as a whole, we designed Model-based Analysis of Competing-endogenous Pathways (MACPath) to infer pathways co-regulated through the ceRNA mechanism (cePathways). Our analysis of breast tumors suggested that NGF (nerve growth factor)-induced tumor cell proliferation might be associated with tumor-related growth factor pathways through ceRNA. MACPath also identified indirect cePathways, whose ceRNA relationship is relayed through mediating ceRNAs. Finally, MACPath identified the mediating ceRNAs that connect the indirect cePathways using an efficient integer linear programming technique. Mediating ceRNAs are unexpectedly enriched in tumor suppressor genes, whose down-regulation is suspected to disrupt indirect cePathways, such as between DNA replication and WNT signaling pathways. Altogether, MACPath is the first computational method to comprehensively understand the functions of thousands of ceRNA genes, both direct and indirect, at the pathway level.
Affiliation(s)
- Hyun Jung Park
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Soyeon Kim
- Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center, Houston, Texas, United States of America
| | - Wei Li
- Division of Biostatistics, Dan L Duncan Cancer Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, United States of America
|
15
|
Nishiwaki H, Ito M, Negishi S, Sobue S, Ichihara M, Ohno K. Molecular hydrogen upregulates heat shock response and collagen biosynthesis, and downregulates cell cycles: meta-analyses of gene expression profiles. Free Radic Res 2018; 52:434-445. [PMID: 29424253 DOI: 10.1080/10715762.2018.1439166] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/27/2022]
Abstract
Molecular hydrogen exerts its effect on multiple pathologies, including oxidative stress, inflammation, and apoptosis. However, its molecular mechanisms have not been fully elucidated. In order to explore the effects of molecular hydrogen, we meta-analysed gene expression profiles modulated by molecular hydrogen. We performed microarray analysis of the mouse liver with or without drinking hydrogen water. We also integrated two previously reported microarray datasets of the rat liver into meta-analyses. We used two categories of meta-analysis methods: the cross-platform method and the conventional meta-analysis method (Fisher's method). For each method, hydrogen-modulated pathways were analysed by (i) the hypergeometric test (HGT) in the class of over-representation analysis (ORA), (ii) the gene set enrichment analysis (GSEA) in the class of functional class scoring (FCS), and (iii) the signalling pathway impact analysis (SPIA), pathway regulation score (PRS), and others in the class of pathway topology-based approach (PTA). Pathways in the collagen biosynthesis and the heat-shock response were up-regulated according to (a) HGT with the cross-platform method, (b) GSEA with the cross-platform method, and (c) PRS with the cross-platform method. Pathways in cell cycles were down-regulated according to (a) HGT with the cross-platform method, (b) GSEA with the cross-platform method, and (d) GSEA with the conventional meta-analysis method. Because the heat-shock response leads to up-regulation of collagen biosynthesis and a transient arrest of cell cycles, induction of the heat-shock response is likely to be a primary event induced by molecular hydrogen in the liver of wild-type rodents.
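As a rough illustration of two of the statistical building blocks named in this abstract, the following Python sketch combines p-values with Fisher's method and computes a hypergeometric over-representation p-value. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names and arguments are hypothetical, the combined p-values are assumed to be independent, and the cross-platform method, GSEA, SPIA and PRS are not covered.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (illustrative helper)."""
    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); under the null hypothesis
    # it follows a chi-square distribution with 2k degrees of freedom.
    half_x = -sum(math.log(p) for p in p_values)
    # For even degrees of freedom (2k) the chi-square survival function has
    # the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    return math.exp(-half_x) * sum(half_x ** i / math.factorial(i) for i in range(k))


def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value for over-representation analysis.

    Probability of observing at least n_overlap pathway genes among
    n_selected genes drawn without replacement from n_universe genes,
    of which n_pathway belong to the pathway (argument names are hypothetical).
    """
    upper = min(n_pathway, n_selected)
    total = math.comb(n_universe, n_selected)
    tail = sum(
        math.comb(n_pathway, i) * math.comb(n_universe - n_pathway, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total
```

With a single input, `fisher_combined_p` returns that p-value unchanged, which is a quick sanity check on the closed form.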
Affiliation(s)
- Hiroshi Nishiwaki
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Mikako Ito
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Shuto Negishi
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Sayaka Sobue
- Department of Biomedical Sciences, College of Life and Health Sciences, Chubu University, Kasugai, Japan
| | - Masatoshi Ichihara
- Department of Biomedical Sciences, College of Life and Health Sciences, Chubu University, Kasugai, Japan
| | - Kinji Ohno
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
|
16
|
Hidalgo MR, Cubuk C, Amadoz A, Salavert F, Carbonell-Caballero J, Dopazo J. High throughput estimation of functional cell activities reveals disease mechanisms and predicts relevant clinical outcomes. Oncotarget 2018; 8:5160-5178. [PMID: 28042959 PMCID: PMC5354899 DOI: 10.18632/oncotarget.14107] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 09/01/2016] [Accepted: 11/21/2016] [Indexed: 12/21/2022] Open
Abstract
Understanding the aspects of the cell functionality that account for disease or drug action mechanisms is a main challenge for precision medicine. Here we propose a new method that models cell signaling using biological knowledge on signal transduction. The method recodes individual gene expression values (and/or gene mutations) into accurate measurements of changes in the activity of signaling circuits, which ultimately constitute high-throughput estimations of cell functionalities caused by gene activity within the pathway. Moreover, such estimations can be obtained either at cohort-level, in case/control comparisons, or personalized for individual patients. The accuracy of the method is demonstrated in an extensive analysis involving 5640 patients from 12 different cancer types. Circuit activity measurements not only have a high diagnostic value but also can be related to relevant disease outcomes such as survival, and can be used to assess therapeutic interventions.
Affiliation(s)
- Marta R Hidalgo
- Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, 46012, Spain
| | - Cankut Cubuk
- Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, 46012, Spain
| | - Alicia Amadoz
- Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, 46012, Spain.,Functional Genomics Node (INB-ELIXIR-es), Valencia, 46012, Spain
| | - Francisco Salavert
- Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, 46012, Spain.,Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Valencia, 46012, Spain
| | - José Carbonell-Caballero
- Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, 46012, Spain
| | - Joaquin Dopazo
- Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, 46012, Spain.,Functional Genomics Node (INB-ELIXIR-es), Valencia, 46012, Spain.,Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Valencia, 46012, Spain
|
17
|
Diao X, Liu A. Identification of core pathways based on attractor and crosstalk in ischemic stroke. Exp Ther Med 2018; 15:1520-1524. [PMID: 29434737 PMCID: PMC5776172 DOI: 10.3892/etm.2017.5563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/30/2017] [Accepted: 11/21/2017] [Indexed: 01/17/2023] Open
Abstract
Ischemic stroke is a leading cause of mortality and disability around the world. Identifying dysregulated pathways, which yield molecular and functional insights from high-throughput experimental data, is an important task. The gene expression profile E-GEOD-16561 was collected. Pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes database, and Retrieval of Interacting Genes was used to download protein-protein interaction sets. Attractor and crosstalk approaches were applied to screen dysregulated pathways. A total of 20 differentially expressed genes were identified in ischemic stroke. Thirty-nine significant differential pathways were identified with P<0.01, 28 pathways with RP<0.01, and 17 pathways with impact factor >250. On the basis of the three criteria, 11 significant dysfunctional pathways were identified. Among them, Epstein-Barr virus infection was the most significant differential pathway. In conclusion, with the method based on attractor and crosstalk, significantly dysfunctional pathways were identified. These pathways are expected to shed light on the molecular mechanisms of ischemic stroke and may represent novel potential therapeutic targets for ischemic stroke treatment.
Affiliation(s)
- Xiufang Diao
- Department of Respiratory Medicine, Weifang People's Hospital, Weifang, Shandong 261041, P.R. China
| | - Aijuan Liu
- Department of Cardiology, Weifang People's Hospital, Weifang, Shandong 261041, P.R. China
|
18
|
Kim S. Identifying dynamic pathway interactions based on clinical information. Comput Biol Chem 2017; 68:260-265. [PMID: 28463775 DOI: 10.1016/j.compbiolchem.2017.04.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/24/2016] [Revised: 04/16/2017] [Accepted: 04/17/2017] [Indexed: 10/19/2022]
Abstract
In this paper, we introduce approaches for inferring dynamic pathway interactions by converting static datasets into dynamic datasets using patients' clinical information. One approach uses survival time-based dynamic datasets, and the other uses grade- and stage-based dynamic datasets. Based on cancer grades and stages, we generated six dynamic levels and obtained two pairs of significant pathways out of twelve enriched pathways. One pair of the pathways included CELL ADHESION MOLECULES CAMS and SYSTEMIC LUPUS ERYTHEMATOSUS (correlation coefficient=1.00), in which CD28, CD86, HLA-DOA, and HLA-DOB were identified as common genes in the pathways. The other pair of the pathways included SPLICEOSOME and PRIMARY IMMUNODEFICIENCY (correlation coefficient=0.94) with no common genes identified.
Affiliation(s)
- Shinuk Kim
- Department of Civil Engineering, Sangmyung University, Cheonan Chungnam 31066, Republic of Korea.
|
19
|
Abstract
De novo pathway enrichment is a powerful approach to discover previously uncharacterized molecular mechanisms in addition to already known pathways. To achieve this, condition-specific functional modules are extracted from large interaction networks. Here, we give an overview of the state of the art and present the first framework for assessing the performance of existing methods. We identified 19 tools and selected seven representative candidates for a comparative analysis with more than 12,000 runs, spanning different biological networks, molecular profiles, and parameters. Our results show that none of the methods consistently outperforms the others. To mitigate this issue for biomedical researchers, we provide guidelines to choose the appropriate tool for a given dataset. Moreover, our framework is the first attempt for a quantitative evaluation of de novo methods, which will allow the bioinformatics community to objectively compare future tools against the state of the art. De novo pathway enrichment methods are essential to understand disease complexity. They can uncover disease-specific functional modules by integrating molecular interaction networks with expression profiles. However, how should researchers choose one method out of several? In this article, a group of scientists from Denmark and Germany presents the first attempt to quantitatively evaluate existing methods. This framework will help the biomedical community to find the appropriate tool(s) for their data. They created synthetic gold standards and simulated expression profiles to perform a systematic assessment of various tools. They observed that the choice of interaction network, parameter settings, preprocessing of expression data and statistical properties of the expression profiles influence the results to a large extent. The results reveal strengths and limitations of the individual methods and suggest using two or more tools to obtain comprehensive disease-modules.
|
20
|
Investigation of coordination and order in transcription regulation of innate and adaptive immunity genes in type 1 diabetes. BMC Med Genomics 2017; 10:7. [PMID: 28143555 PMCID: PMC5282641 DOI: 10.1186/s12920-017-0243-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 01/26/2016] [Accepted: 01/25/2017] [Indexed: 01/19/2023] Open
Abstract
Background: Type 1 diabetes (T1D) is an autoimmune disease, and extensive evidence has indicated a critical role of both the innate and the adaptive arms of the immune system in disease development. To date, most clinical trials of immunomodulation therapies have failed to show efficacy. A number of gene expression studies of T1D have been carried out. However, a systems analysis of the expression variations of the innate and adaptive immunity gene sets, or of their co-expression network structures in cohorts at different disease states or of different disease risks, has not been available until now. Methods: We utilized data from a large gene expression study that included transcription profiles of control peripheral blood mononuclear cells (PBMC) exposed to plasma of 148 human subjects from four cohorts that included unrelated healthy controls (uHC), recent onset T1D patients (RO-T1D), and healthy siblings of probands that possess high (HRS, High Risk Sibling) or low (LRS, Low Risk Sibling) risk HLA haplotypes. Both weighted and non-weighted co-expression networks were constructed in each cohort separately, and edge weight distribution and the activation of known protein complexes were examined. The co-expression networks of the innate and adaptive immunity genes were further examined in more detail through a number of network measures that included network density, Shannon entropy, h-index, and the scaling exponent γ of degree distribution. Pathway analysis was carried out using CoGA, a tool for detecting significant network structural changes of a gene set. Results: Weighted network edge distribution revealed a globally weakened co-expression network induced by the RO-T1D cohort as compared to that by the uHC, suggesting a broad spectrum loss of transcriptional coordination. The two healthy T1D family cohorts (HRS and LRS) induced more active but heterogeneous transcription coordination globally, and among both the innate and the adaptive immunity genes, than the uHC. This finding is consistent with our previous report of these cohorts sharing a heightened innate inflammatory state. The spike-in of IL-1RA to RO-T1D sera improved co-expression network strength of both the innate and the adaptive immunity genes, and enabled a global order recovery in transcription regulation that resulted in a significantly increased number of activated protein complexes. Many of the top pathways that showed significant differences in co-expression network structures and order between RO-T1D and uHC have strong links to T1D. Conclusions: Network level analysis of the innate and adaptive immunity genes, and the whole genome, revealed striking cohort-dependent differences in co-expression network structural measures, suggesting their potential in cohort classification and disease-relevant pathway identification. The results demonstrated the advantages of systems analysis in defining molecular signatures as well as in predicting targets in future research.
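To make three of the network measures named in this abstract concrete, the sketch below computes network density, the Shannon entropy of the degree distribution, and a degree-based h-index for a small undirected graph. The exact definitions used in the study are not given in the abstract, so the ones here are assumptions (standard textbook forms), and the function name `network_measures` and its inputs are hypothetical.

```python
import math
from collections import Counter


def network_measures(nodes, edges):
    """Density, degree-distribution entropy (bits) and h-index of an undirected graph.

    Assumes at least two nodes, no self-loops and no duplicate edges.
    """
    n = len(nodes)
    degree = Counter({node: 0 for node in nodes})
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Density: fraction of the n*(n-1)/2 possible edges that are present.
    density = 2 * len(edges) / (n * (n - 1))

    # Shannon entropy of the empirical degree distribution P(k).
    degree_counts = Counter(degree.values())
    entropy = sum((c / n) * math.log2(n / c) for c in degree_counts.values())

    # h-index (assumed definition): the largest h such that at least
    # h nodes have degree >= h.
    sorted_degrees = sorted(degree.values(), reverse=True)
    h_index = sum(1 for rank, d in enumerate(sorted_degrees, start=1) if d >= rank)

    return {"density": density, "entropy": entropy, "h_index": h_index}


# Toy example: a triangle plus one isolated node.
toy = network_measures(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("a", "c")])
```

In a co-expression setting, an edge would typically be drawn when the correlation between two genes passes a chosen threshold; that thresholding step is outside this sketch.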
|
21
|
Kaushik A, Ali S, Gupta D. Altered Pathway Analyzer: A gene expression dataset analysis tool for identification and prioritization of differentially regulated and network rewired pathways. Sci Rep 2017; 7:40450. [PMID: 28084397 PMCID: PMC5233954 DOI: 10.1038/srep40450] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/12/2016] [Accepted: 12/07/2016] [Indexed: 12/13/2022] Open
Abstract
Gene connection rewiring is an essential feature of gene network dynamics. Apart from its normal functional role, it may also lead to dysregulated functional states by disturbing pathway homeostasis. Very few computational tools measure rewiring within gene co-expression and its corresponding regulatory networks in order to identify and prioritize altered pathways which may or may not be differentially regulated. We have developed Altered Pathway Analyzer (APA), a microarray dataset analysis tool for identification and prioritization of altered pathways, including those which are differentially regulated by TFs, by quantifying rewired sub-network topology. Moreover, APA also helps in re-prioritization of APA shortlisted altered pathways enriched with context-specific genes. We performed APA analysis of simulated datasets and p53 status NCI-60 cell line microarray data to demonstrate potential of APA for identification of several case-specific altered pathways. APA analysis reveals several altered pathways not detected by other tools evaluated by us. APA analysis of unrelated prostate cancer datasets identifies sample-specific as well as conserved altered biological processes, mainly associated with lipid metabolism, cellular differentiation and proliferation. APA is designed as a cross platform tool which may be transparently customized to perform pathway analysis in different gene expression datasets. APA is freely available at http://bioinfo.icgeb.res.in/APA.
Affiliation(s)
- Abhinav Kaushik
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, New Delhi 110067, India
| | - Shakir Ali
- Department of Biochemistry, Jamia Hamdard, Deemed University, New Delhi 110062, India
| | - Dinesh Gupta
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, New Delhi 110067, India
|
22
|
Wang Y, Lin S, Li C, Li Y, Chen L, Wang Y. A Novel Method for Pathway Identification Based on Attractor and Crosstalk in Polyarticular Juvenile Idiopathic Arthritis. Med Sci Monit 2016; 22:4152-4158. [PMID: 27804927 PMCID: PMC5103838 DOI: 10.12659/msm.897792] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/09/2023] Open
Abstract
Background: Juvenile idiopathic arthritis (JIA) is one of the most common inflammatory disorders of unknown etiology. We introduced a novel method to identify dysregulated pathways associated with polyarticular JIA (pJIA). Material/Methods: Gene expression profiles of 61 children with pJIA and 59 healthy controls were collected from E-GEOD-13849; 300 pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and 787,896 protein-protein interaction sets were gathered from the Retrieval of Interacting Genes. Attractor and crosstalk were designed to complement each other to increase the integrity of pathway assessment. Then, the impact factor was used to assess the interactions between pathways, and the RP-value was used to evaluate the comprehensive influential ability of attractors. Results: There were seven attractors with p<0.01 and 14 pathways with RP<0.01. Finally, two significantly dysfunctional pathways related to pJIA progression were found: p53 signaling pathway (KEGG ID: 04115) and non-alcoholic fatty liver disease (NAFLD) (KEGG ID: 04932). Conclusions: A novel approach that identified the dysregulated pathways in pJIA was constructed based on attractor and crosstalk. The new process is expected to be efficient in the upcoming era of medicine.
Affiliation(s)
- Yuanji Wang
- Department of Orthopaedics, The People's Hospital of Rizhao, Rizhao, Shandong, China (mainland)
| | - Shunhua Lin
- Department of Orthopaedics, The People's Hospital of Rizhao, Rizhao, Shandong, China (mainland)
| | - Changhui Li
- Department of Orthopaedics, The People's Hospital of Rizhao, Rizhao, Shandong, China (mainland)
| | - Yizhao Li
- Department of Orthopaedics, The People's Hospital of Rizhao, Rizhao, Shandong, China (mainland)
| | - Lei Chen
- Department of Orthopaedics, The People's Hospital of Rizhao, Rizhao, Shandong, China (mainland)
| | - Yingzhen Wang
- Department of Joint Surgery, The Affiliated Hospital of Qingdao University, Qingdao, Shandong, China (mainland)
|
23
|
Voyle N, Keohane A, Newhouse S, Lunnon K, Johnston C, Soininen H, Kloszewska I, Mecocci P, Tsolaki M, Vellas B, Lovestone S, Hodges A, Kiddle S, Dobson RJ. A Pathway Based Classification Method for Analyzing Gene Expression for Alzheimer's Disease Diagnosis. J Alzheimers Dis 2016; 49:659-69. [PMID: 26484910 PMCID: PMC4927941 DOI: 10.3233/jad-150440] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 12/27/2022]
Abstract
Background: Recent studies indicate that gene expression levels in blood may be able to differentiate subjects with Alzheimer’s disease (AD) from normal elderly controls and mild cognitively impaired (MCI) subjects. However, there is limited replicability at the single marker level. A pathway-based interpretation of gene expression may prove more robust. Objectives: This study aimed to investigate whether a case/control classification model built on pathway level data was more robust than a gene level model and may consequently perform better in test data. The study used two batches of gene expression data from the AddNeuroMed (ANM) and Dementia Case Registry (DCR) cohorts. Methods: Our study used Illumina Human HT-12 Expression BeadChips to collect gene expression from blood samples. Random forest modeling with recursive feature elimination was used to predict case/control status. Age and APOE ɛ4 status were used as covariates for all analyses. Results: Gene and pathway level models performed similarly to each other and to a model based on demographic information only. Conclusions: Any potential increase in concordance from the novel pathway level approach used here has not led to a greater predictive ability in these datasets. However, we have only tested one method for creating pathway level scores. Further, we have been able to benchmark pathways against genes in datasets that had been extensively harmonized. Further work should focus on the use of alternative methods for creating pathway level scores, in particular those that incorporate pathway topology, and the use of an endophenotype based approach.
Affiliation(s)
- Nicola Voyle
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.,MRC Social, Genetic and Developmental Psychiatry Centre, King's College London, London, UK
| | - Aoife Keohane
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
| | - Stephen Newhouse
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.,NIHR Biomedical Research Centre for Mental Health and Biomedical Research Unit for Dementia at South London and Maudsley NHS Foundation, London, UK
| | | | - Caroline Johnston
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.,NIHR Biomedical Research Centre for Mental Health and Biomedical Research Unit for Dementia at South London and Maudsley NHS Foundation, London, UK
| | - Hilkka Soininen
- Department of Neurology, University of Eastern Finland and Kuopio University Hospital, Kuopio, Finland
| | | | - Patrizia Mecocci
- Institute of Gerontology and Geriatrics, University of Perugia, Perugia, Italy
| | - Magda Tsolaki
- 3rd Department of Neurology, Aristotle University, Thessaloniki, Greece
| | | | - Simon Lovestone
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.,Department of Psychiatry, Oxford University, Oxford, UK
| | - Angela Hodges
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
| | - Steven Kiddle
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.,MRC Social, Genetic and Developmental Psychiatry Centre, King's College London, London, UK
| | - Richard Jb Dobson
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.,NIHR Biomedical Research Centre for Mental Health and Biomedical Research Unit for Dementia at South London and Maudsley NHS Foundation, London, UK
|
24
|
Chen HR, Sherr DH, Hu Z, DeLisi C. A network based approach to drug repositioning identifies plausible candidates for breast cancer and prostate cancer. BMC Med Genomics 2016; 9:51. [PMID: 27475327 PMCID: PMC4967295 DOI: 10.1186/s12920-016-0212-7] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 03/03/2016] [Accepted: 07/20/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The high cost and the long time required to bring drugs into commerce are driving efforts to repurpose FDA-approved drugs, that is, to find new uses for which they were not originally intended, and thereby to reduce the overall cost of commercialization and shorten the lag between drug discovery and availability. We report on the development, testing and application of a promising new approach to repositioning. METHODS: Our approach is based on mining a human functional linkage network for inversely correlated modules of drug and disease gene targets. The method takes account of multiple information sources, including gene mutation, gene expression, and functional connectivity and proximity of within-module genes. RESULTS: The method was used to identify candidates for treating breast and prostate cancer. We found that (i) the recall rate for FDA-approved drugs for breast (prostate) cancer is 20/20 (10/11), while the rates for drugs in clinical trials were 131/154 and 82/106; (ii) the ROC/AUC performance substantially exceeds that of comparable methods; (iii) preliminary in vitro studies indicate that 5/5 candidates have therapeutic indices superior to that of Doxorubicin in MCF7 and SUM149 cancer cell lines. We briefly discuss the biological plausibility of the candidates at a molecular level in the context of the biological processes that they mediate. CONCLUSIONS: Our method appears to offer promise for the identification of multi-targeted drug candidates that can correct aberrant cellular functions. In particular, the computational performance exceeded that of other CMap-based methods, and in vitro experiments indicate that 5/5 candidates have therapeutic indices superior to that of Doxorubicin in MCF7 and SUM149 cancer cell lines. The approach has the potential to provide a more efficient drug discovery pipeline.
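The recall figures quoted in this abstract are easier to compare as percentages. The short Python sketch below converts the reported counts; the counts come from the abstract, while the dictionary layout, the helper name `recall`, and the reading of 131/154 and 82/106 as breast and prostate respectively (by parallel with the preceding clause) are assumptions.

```python
# Recovered / total known drugs, as reported in the abstract.
# Assumption: the clinical-trial counts follow the same breast, prostate order.
recall_counts = {
    ("breast", "FDA approved"): (20, 20),
    ("prostate", "FDA approved"): (10, 11),
    ("breast", "clinical trials"): (131, 154),
    ("prostate", "clinical trials"): (82, 106),
}


def recall(recovered, total):
    """Recall = known positives recovered by the method / all known positives."""
    return recovered / total


# Recall per cancer type and drug category, in percent, rounded to one decimal.
recall_percent = {
    key: round(100 * recall(recovered, total), 1)
    for key, (recovered, total) in recall_counts.items()
}

for (cancer, category), pct in recall_percent.items():
    print(f"{cancer:8s} {category:16s} {pct:5.1f}%")
```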
Affiliation(s)
- Hsiao-Rong Chen
- Bioinformatics Program, College of Engineering, Boston University, Boston, MA, USA.,Graduate Program in Translational Molecular Medicine, Boston University School of Medicine, Boston, MA, USA
| | - David H Sherr
- Department of Environmental Health, Boston University School of Public Health, Boston, MA, USA
| | - Zhenjun Hu
- Bioinformatics Program, College of Engineering, Boston University, Boston, MA, USA
| | - Charles DeLisi
- Bioinformatics Program, College of Engineering, Boston University, Boston, MA, USA.,Department of Biomedical Engineering, Boston University, Boston, MA, USA.
|
25
|
Ruan J, Jahid MJ, Gu F, Lei C, Huang YW, Hsu YT, Mutch DG, Chen CL, Kirma NB, Huang THM. A novel algorithm for network-based prediction of cancer recurrence. Genomics 2016; 111:17-23. [PMID: 27453286 PMCID: PMC5253120 DOI: 10.1016/j.ygeno.2016.07.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 02/10/2016] [Revised: 07/08/2016] [Accepted: 07/18/2016] [Indexed: 10/21/2022]
Abstract
Developing accurate prognostic models is one of the biggest challenges in "omics"-based cancer research. Here, we propose a novel computational method for identifying dysregulated gene subnetworks as biomarkers to predict cancer recurrence. Applying our method to the DNA methylome of endometrial cancer patients, we identified a subnetwork consisting of differentially methylated (DM) genes, and non-differentially methylated genes, termed Epigenetic Connectors (EC), that are topologically important for connecting the DM genes in a protein-protein interaction network. The ECs are statistically significantly enriched in well-known tumorigenesis and metastasis pathways, and include known epigenetic regulators. Importantly, combining the DMs and ECs as features using a novel random walk procedure, we constructed a support vector machine classifier that significantly improved the prediction accuracy of cancer recurrence and outperformed several alternative methods, demonstrating the effectiveness of our network-based approach.
Affiliation(s)
- Jianhua Ruan
- Department of Computer Science, University of Texas, San Antonio, TX, USA; Department of Molecular Medicine, University of Texas Health Science Center, San Antonio, TX, USA; Department of Electrical Engineering and Computer Science, McNeese State University, Lake Charles, LA, USA.
| | - Md Jamiul Jahid
- Department of Computer Science, University of Texas, San Antonio, TX, USA
| | - Fei Gu
- Department of Molecular Medicine, University of Texas Health Science Center, San Antonio, TX, USA
| | - Chengwei Lei
- Department of Electrical Engineering and Computer Science, McNeese State University, Lake Charles, LA, USA
| | - Yi-Wen Huang
- Department of Obstetrics and Gynecology, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Ya-Ting Hsu
- Department of Molecular Medicine, University of Texas Health Science Center, San Antonio, TX, USA
| | - David G Mutch
- Department of Obstetrics and Gynecology, Washington University School of Medicine, St. Louis, MO, USA
| | - Chun-Liang Chen
- Department of Molecular Medicine, University of Texas Health Science Center, San Antonio, TX, USA
| | - Nameer B Kirma
- Department of Molecular Medicine, University of Texas Health Science Center, San Antonio, TX, USA
| | - Tim H-M Huang
- Department of Molecular Medicine, University of Texas Health Science Center, San Antonio, TX, USA; Cancer Therapy & Research Center, University of Texas Health Science Center, San Antonio, TX, USA.
|
26
|
Gao C, McDowell IC, Zhao S, Brown CD, Engelhardt BE. Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering. PLoS Comput Biol 2016; 12:e1004791. [PMID: 27467526 PMCID: PMC4965098 DOI: 10.1371/journal.pcbi.1004791] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 05/14/2015] [Accepted: 02/03/2016] [Indexed: 01/15/2023] Open
Abstract
Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes that covary in all of the samples or in only a subset of the samples. Our biclustering method, BicMix, allows overcomplete representations of the data, computational tractability, and joint modeling of unknown confounders and biological signals. Compared with state-of-the-art biclustering methods, BicMix recovers latent structure with higher precision across diverse simulation scenarios. Further, we develop a principled method to recover context specific gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and to gene expression data from a cardiovascular study cohort, and we recover gene co-expression networks that are differential across ER+ and ER- samples and across male and female samples. We apply BicMix to the Genotype-Tissue Expression (GTEx) pilot data, and we find tissue specific gene networks. We validate these findings by using our tissue specific networks to identify trans-eQTLs specific to one of four primary tissues.
Affiliation(s)
- Chuan Gao
- Department of Statistical Science, Duke University, Durham, North Carolina, United States of America
| | - Ian C. McDowell
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina, United States of America
| | - Shiwen Zhao
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina, United States of America
| | - Christopher D. Brown
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Barbara E. Engelhardt
- Department of Computer Science, Center for Statistics and Machine Learning, Princeton University, Princeton, New Jersey, United States of America
|
27
|
Hu Z, Jiang K, Frank MB, Chen Y, Jarvis JN. Complexity and Specificity of the Neutrophil Transcriptomes in Juvenile Idiopathic Arthritis. Sci Rep 2016; 6:27453. [PMID: 27271962 PMCID: PMC4895221 DOI: 10.1038/srep27453] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 06/26/2015] [Accepted: 05/19/2016] [Indexed: 12/17/2022] Open
Abstract
NIH projects such as ENCODE and Roadmap Epigenomics have revealed surprising complexity in the transcriptomes of mammalian cells. In this study, we explored transcriptional complexity in human neutrophils, cells generally regarded as nonspecific in their functions and responses. We studied distinct human disease phenotypes and found that, at the gene, gene isoform, and miRNA level, neutrophils exhibit considerable specificity in their transcriptomes. Thus, even cells whose responses are considered non-specific show tailoring of their transcriptional repertoire toward specific physiologic or pathologic contexts. We also found that miRNAs had a global impact on the neutrophil transcriptome and are associated with innate immunity in juvenile idiopathic arthritis (JIA). These findings have important implications for our understanding of the link between genes, non-coding transcripts and disease phenotypes.
Affiliation(s)
- Zihua Hu
- Center for Computational Research, New York State Center of Excellence in Bioinformatics & Life Sciences, State University of New York at Buffalo, Buffalo, NY 14260, USA; Department of Ophthalmology, Department of Biostatistics, Department of Medicine, State University of New York at Buffalo, Buffalo, NY 14260, USA; SUNY Eye Institute, Buffalo, NY 14260, USA
| | - Kaiyu Jiang
- Department of Pediatrics, Division of Allergy/Immunology/Rheumatology, University at Buffalo, Buffalo, NY 14203, USA
| | - Mark Barton Frank
- Arthritis & Immunology Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
| | - Yanmin Chen
- Department of Pediatrics, Division of Allergy/Immunology/Rheumatology, University at Buffalo, Buffalo, NY 14203, USA
| | - James N Jarvis
- Department of Pediatrics, Division of Allergy/Immunology/Rheumatology, University at Buffalo, Buffalo, NY 14203, USA; Graduate Program in Genetics, Genomics, & Bioinformatics, University at Buffalo, Buffalo, NY 14203, USA
|
28
|
ToPASeq: an R package for topology-based pathway analysis of microarray and RNA-Seq data. BMC Bioinformatics 2015; 16:350. [PMID: 26514335 PMCID: PMC4625615 DOI: 10.1186/s12859-015-0763-1] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 09/07/2015] [Accepted: 10/07/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Pathway analysis methods, in which differentially expressed genes are mapped to databases of reference pathways and relative enrichment is assessed, help investigators to propose biologically relevant hypotheses. The latest generation of pathway analysis methods takes into account the topological structure of a pathway, which helps to increase both specificity and sensitivity of the findings. Simultaneously, RNA-Seq technology is gaining popularity and is becoming widely used for gene expression profiling. Unfortunately, the majority of topological pathway analysis methods remain without an implementation, and where an implementation exists, it is limited in various respects. RESULTS We developed a new R/Bioconductor package, ToPASeq, offering a uniform interface to seven distinct topology-based pathway analysis methods, of which three we implemented de novo and four were adjusted from existing implementations. Apart from this, ToPASeq offers a set of tailored visualization functions and functions for importing and manipulating pathways and their topologies, facilitating the application of the methods to different species. The package can be used to compare the differential expression of pathways between two conditions on both gene expression microarray and RNA-Seq data. The package is written in R and is available from Bioconductor 3.2 under the AGPL-3 license. CONCLUSION ToPASeq is a novel package that offers seven distinct methods for topology-based pathway analysis, which are easily applicable to microarray as well as RNA-Seq data, both in human and other species. At the same time, it provides specific tools for visualization of the results.
|
29
|
Du N, Jiang K, Sawle AD, Frank MB, Wallace CA, Zhang A, Jarvis JN. Dynamic tracking of functional gene modules in treated juvenile idiopathic arthritis. Genome Med 2015; 7:109. [PMID: 26497493 PMCID: PMC4619406 DOI: 10.1186/s13073-015-0227-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/18/2015] [Accepted: 10/01/2015] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND We have previously shown that childhood-onset rheumatic diseases show aberrant patterns of gene expression that reflect pathology-associated co-expression networks. In this study, we used novel computational approaches to examine how disease-associated networks are altered in one of the most common rheumatic diseases of childhood, juvenile idiopathic arthritis (JIA). METHODS Using whole blood gene expression profiles derived from children in a pediatric rheumatology clinical trial, we used a network approach to understand the impact of therapy and the underlying biology of response/non-response to therapy. RESULTS We demonstrate that therapy for JIA is associated with extensive re-ordering of gene expression networks, even in children who respond inadequately to therapy. Furthermore, we observe distinct differences in the evolution of specific network properties when we compare children who have been treated successfully with those who have inadequate treatment response. CONCLUSIONS Despite the inherent noisiness of whole blood gene expression data, our findings demonstrate how therapeutic response might be mapped and understood in pathologically informative cells in a broad range of human inflammatory diseases.
Affiliation(s)
- Nan Du
- Department of Computer Sciences and Engineering, University at Buffalo, Buffalo, NY, USA.
| | - Kaiyu Jiang
- Department of Pediatrics, Rheumatology Research, University at Buffalo School of Medicine, Buffalo, NY, USA.
| | - Ashley D Sawle
- The Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY, 10032, USA.
| | - Mark Barton Frank
- Oklahoma Medical Research Foundation, Clinical Immunology Program, Oklahoma City, OK, USA.
| | - Carol A Wallace
- Department of Pediatrics, University of Washington, Seattle, WA, USA.
| | - Aidong Zhang
- Department of Computer Sciences and Engineering, University at Buffalo, Buffalo, NY, USA.
| | - James N Jarvis
- Department of Pediatrics, Rheumatology Research, University at Buffalo School of Medicine, Buffalo, NY, USA.
- Genetics, Genomics, and Bioinformatics Program, University at Buffalo, Buffalo, NY, USA.
- Pediatric Rheumatology Research, University at Buffalo Clinical & Translational Research Center, 875 Ellicott St, Buffalo, NY, 14203, USA.
|
30
|
Maino B, D'Agata V, Severini C, Ciotti MT, Calissano P, Copani A, Chang YC, DeLisi C, Cavallaro S. Igf1 and Pacap rescue cerebellar granule neurons from apoptosis via a common transcriptional program. Cell Death Discov 2015; 1. [PMID: 26941962 PMCID: PMC4773033 DOI: 10.1038/cddiscovery.2015.29] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022] Open
Abstract
A shift of the delicate balance between apoptosis and survival-inducing signals determines the fate of neurons during the development of the central nervous system and its homeostasis throughout adulthood. Both pathways, promoting or protecting from apoptosis, trigger a transcriptional program. We conducted whole-genome expression profiling to decipher the transcriptional regulatory elements controlling the apoptotic/survival switch in cerebellar granule neurons following the induction of apoptosis by serum and potassium deprivation or their rescue by either insulin-like growth factor-1 (Igf1) or pituitary adenylyl cyclase-activating polypeptide (Pacap). Although depending on different upstream signaling pathways, the survival effects of Igf1 and Pacap converged into common transcriptional cascades, thus suggesting the existence of a general transcriptional program underlying neuronal survival.
Affiliation(s)
- Barbara Maino
- Institute of Neurological Sciences, Italian National Research Council, 95126 Catania, Italy
| | - Velia D'Agata
- Department of Biomedical and Biotechnological Sciences, Section of Human Anatomy and Histology, University of Catania, 95123 Catania, Italy
| | - Cinzia Severini
- Institute of Neurobiology and Molecular Medicine, Italian National Research Council, 00143 Roma, Italy
| | | | | | - Agata Copani
- Department of Drug Sciences, University of Catania, 95125 Catania, Italy
| | - Yi-Chien Chang
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA
| | - Charles DeLisi
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - Sebastiano Cavallaro
- Institute of Neurological Sciences, Italian National Research Council, 95126 Catania, Italy
|
31
|
Han J, Shi X, Zhang Y, Xu Y, Jiang Y, Zhang C, Feng L, Yang H, Shang D, Sun Z, Su F, Li C, Li X. ESEA: Discovering the Dysregulated Pathways based on Edge Set Enrichment Analysis. Sci Rep 2015; 5:13044. [PMID: 26267116 PMCID: PMC4533315 DOI: 10.1038/srep13044] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 03/12/2015] [Accepted: 07/06/2015] [Indexed: 02/06/2023] Open
Abstract
Pathway analyses are playing an increasingly important role in understanding biological mechanisms, cellular function and disease states. Current pathway-identification methods generally focus only on changes in gene expression levels; however, the biological relationships among genes are also fundamental components of pathways, and dysregulated relationships may also alter pathway activities. We propose a powerful computational method, Edge Set Enrichment Analysis (ESEA), for the identification of dysregulated pathways. This provides a novel way of pathway analysis by investigating the changes of biological relationships of pathways in the context of gene expression data. Simulation studies illustrate the power and performance of ESEA under various simulated conditions. Using real datasets from p53 mutation, Type 2 diabetes and lung cancer, we validate the effectiveness of ESEA in identifying dysregulated pathways. We further compare our results with five other pathway enrichment analysis methods. With these analyses, we show that ESEA is able to help uncover dysregulated biological pathways underlying complex traits and human diseases via specific use of the dysregulated biological relationships. We developed a freely available R-based tool for ESEA. Currently, ESEA supports pathway analysis of seven public databases (KEGG; Reactome; Biocarta; NCI; SPIKE; HumanCyc; Panther).
Affiliation(s)
- Junwei Han
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Xinrui Shi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Yunpeng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Yanjun Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Ying Jiang
- College of Basic Medical Science, Heilongjiang University of Chinese Medicine, Harbin 150040, PR China
| | - Chunlong Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Li Feng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Haixiu Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Desi Shang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Zeguo Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Fei Su
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
| | - Chunquan Li
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Harbin, 150081, PR China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, PR China
|
32
|
Han J, Li C, Yang H, Xu Y, Zhang C, Ma J, Shi X, Liu W, Shang D, Yao Q, Zhang Y, Su F, Feng L, Li X. A novel dysregulated pathway-identification analysis based on global influence of within-pathway effects and crosstalk between pathways. J R Soc Interface 2015; 12:20140937. [PMID: 25551156 DOI: 10.1098/rsif.2014.0937] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 01/25/2023] Open
Abstract
Identifying dysregulated pathways from high-throughput experimental data in order to infer underlying biological insights is an important task. Current pathway-identification methods focus on single pathways in isolation; however, consideration of crosstalk between pathways could improve our understanding of alterations in biological states. We propose a novel method of pathway analysis based on global influence (PAGI) to identify dysregulated pathways, by considering both within-pathway effects and crosstalk between pathways. We constructed a global gene–gene network based on the relationships among genes extracted from a pathway database. We then evaluated the extent of differential expression for each gene, and mapped these values to the global network. The random walk with restart algorithm was used to calculate the extent to which genes are affected by global influence. Finally, we used cumulative distribution functions to determine the significance values of the dysregulated pathways. We applied the PAGI method to five cancer microarray datasets, and compared our results with gene set enrichment analysis and five other methods. Based on these analyses, we demonstrated that PAGI can effectively identify dysregulated pathways associated with cancer, with strong reproducibility and robustness. We implemented PAGI as freely available R-based and Web-based tools (http://bioinfo.hrbmu.edu.cn/PAGI).
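The random walk with restart step named in this abstract can be sketched as follows. This is a minimal illustration and not the PAGI implementation: the function name, the restart probability of 0.7, the column normalization, and the toy three-gene network are all assumptions made for the example.

```python
def random_walk_with_restart(adjacency, seed_scores, restart=0.7,
                             tol=1e-12, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adjacency   : square list of lists, adjacency[i][j] > 0 if genes i and j are linked
    seed_scores : non-negative per-gene scores (e.g. differential expression),
                  assumed here to be the restart distribution after normalization
    """
    n = len(adjacency)
    # Column sums, used to turn the adjacency matrix into transition weights W.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    total = sum(seed_scores)
    p0 = [s / total for s in seed_scores]
    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability flowing into gene i from its neighbours
            # (genes with no neighbours pass nothing on).
            flow = sum(adjacency[i][j] / col_sums[j] * p[j]
                       for j in range(n) if col_sums[j] > 0)
            nxt.append((1 - restart) * flow + restart * p0[i])
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy network: gene 0 - gene 1 - gene 2, with only gene 0 seeded.
toy_network = [[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]]
scores = random_walk_with_restart(toy_network, [1.0, 0.0, 0.0])
print(scores)
```

In this toy case the seeded gene keeps the highest score and the score decays with network distance from it, which is the sense in which the walk propagates a gene's signal to its neighbourhood.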
|
33
|
Di Lena P, Martelli PL, Fariselli P, Casadio R. NET-GE: a novel NETwork-based Gene Enrichment for detecting biological processes associated to Mendelian diseases. BMC Genomics 2015; 16 Suppl 8:S6. [PMID: 26110971 PMCID: PMC4480278 DOI: 10.1186/1471-2164-16-s8-s6] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 01/18/2023] Open
Abstract
Background Enrichment analysis is a widely applied procedure for shedding light on the molecular mechanisms and functions at the basis of phenotypes, for enlarging the dataset of possibly related genes/proteins and for helping interpretation and prioritization of newly determined variations. Several standard and network-based enrichment methods are available. Both approaches rely on the annotations that characterize the genes/proteins included in the input set; network-based ones also include, in different ways, physical and functional relationships among different genes or proteins that can be extracted from the available biological networks of interactions. Results Here we describe a novel procedure based on the extraction from the STRING interactome of sub-networks connecting proteins that share the same Gene Ontology (GO) terms for Biological Process (BP). Enrichment analysis is performed by mapping the protein set to be analyzed on the sub-networks, and then by collecting the corresponding annotations. We test the ability of our enrichment method to find annotation terms disregarded by other available enrichment methods. We benchmarked 244 sets of proteins associated with different Mendelian diseases, according to the OMIM web resource. In 143 cases (58%), the network-based procedure extracts GO terms neglected by the standard method, and in 86 cases (35%), some of the newly enriched GO terms are not included in the set of annotations characterizing the input proteins. We present in detail six cases where our network-based enrichment provides an insight into the biological basis of the diseases, outperforming other freely available network-based methods. Conclusions Considering a set of proteins in the context of their interaction network can help in better defining their functions. Our novel method exploits the information contained in the STRING database for building the minimal connecting network containing all the proteins annotated with the same GO term. The enrichment procedure is performed considering the GO-specific network modules and, when tested on the OMIM-derived benchmark sets, it is able to extract enrichment terms neglected by other methods. Our procedure is effective even when the size of the input protein set is small, requiring at least two input proteins.
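For orientation, a standard annotation-only enrichment test of the kind this abstract contrasts with network-based methods is often a hypergeometric over-representation test. The abstract does not say which statistic NET-GE or its comparison methods use, so the sketch below is an assumption-laden illustration: the function name, argument names, and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(universe_size, term_size, sample_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe_size : number of annotated proteins in the background
    term_size     : background proteins annotated with the term of interest
    sample_size   : number of proteins in the input set
    overlap       : input proteins annotated with the term
    """
    upper = min(term_size, sample_size)
    total = comb(universe_size, sample_size)
    # Sum the probabilities of seeing `overlap` or more annotated proteins
    # when drawing `sample_size` proteins without replacement.
    tail = sum(comb(term_size, k) * comb(universe_size - term_size, sample_size - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical toy numbers: 10 background proteins, 5 carry the term,
# and all 5 proteins of the input set carry it.
print(hypergeom_enrichment_pvalue(10, 5, 5, 5))
```

A test of this form sees only the annotations of the input proteins themselves, which is why terms carried solely by connecting proteins in a network module cannot be recovered by it.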
|
34
|
Laukens K, Naulaerts S, Berghe WV. Bioinformatics approaches for the functional interpretation of protein lists: from ontology term enrichment to network analysis. Proteomics 2015; 15:981-96. [PMID: 25430566 DOI: 10.1002/pmic.201400296] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 06/30/2014] [Revised: 10/16/2014] [Accepted: 11/24/2014] [Indexed: 12/24/2022]
Abstract
The main result of a great many published proteomics studies is a list of identified proteins, which then needs to be interpreted in relation to the research question and existing knowledge. In the early days of proteomics this interpretation was based only on expert insights, acquired by digesting a large amount of relevant literature. With the growing size and complexity of the experimental datasets, many computational techniques, databases, and tools have claimed a central role in this task. In this review we discuss commonly and less commonly used methods to functionally interpret experimental proteome lists and compare them with available knowledge. We first address several functional analysis and enrichment techniques based on ontologies and literature. Then we outline how various types of network and pathway information can be used. While the problem of functional interpretation of proteome data is to an extent equivalent to the interpretation of transcriptome or other "omics" data, this paper addresses some of the specific challenges and solutions of the proteomics field.
Affiliation(s)
- Kris Laukens
- Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan, Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp / Antwerp University Hospital, Antwerp, Belgium
|
35
|
Tian F, Wang Y, Seiler M, Hu Z. Functional characterization of breast cancer using pathway profiles. BMC Med Genomics 2014; 7:45. [PMID: 25041817 PMCID: PMC4113668 DOI: 10.1186/1755-8794-7-45] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/24/2013] [Accepted: 07/09/2014] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND The molecular characteristics of human diseases are often represented by a list of genes termed "signature genes". A significant challenge facing this approach is that of reproducibility: signatures developed on a set of patients may fail to perform well on different sets of patients. As diseases result from perturbed cellular functions, irrespective of the particular genes that contribute to the function, it may be more appropriate to characterize diseases based on these perturbed cellular functions. METHODS We proposed a profile-based approach to characterize a disease using a binary vector whose elements indicate whether a given function is perturbed based on the enrichment analysis of expression data between normal and tumor tissues. Using breast cancer and its four primary clinically relevant subtypes as examples, this approach is evaluated based on the reproducibility, accuracy and resolution of the resulting pathway profiles. RESULTS Pathway profiles for breast cancer and its subtypes are constructed based on data obtained from microarray and RNA-Seq data sets provided by The Cancer Genome Atlas (TCGA), and an additional microarray data set provided by The European Genome-phenome Archive (EGA). An average reproducibility of 68% is achieved between different data sets (TCGA microarray vs. EGA microarray data) and 67% average reproducibility is achieved between different technologies (TCGA microarray vs. TCGA RNA-Seq data). Among the enriched pathways, 74% are known to be associated with breast cancer or other cancers. About 40% of the identified pathways are enriched in all four subtypes, with 4, 2, 4, and 7 pathways enriched only in luminal A, luminal B, triple-negative, and HER2+ subtypes, respectively. Comparison of profiles between subtypes, as well as other diseases, shows that luminal A and luminal B subtypes are more similar to the HER2+ subtype than to the triple-negative subtype, and subtypes of breast cancer are more likely to be closer to each other than to other diseases. CONCLUSIONS Our results demonstrate that pathway profiles can successfully characterize both common and distinct functional characteristics of four subtypes of breast cancer and other related diseases, with acceptable reproducibility, high accuracy and reasonable resolution.
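The binary pathway profile described in METHODS can be illustrated with a short sketch. The abstract does not give the enrichment cutoff or the formula behind its reproducibility percentages, so the 0.05 threshold, the Jaccard overlap used to compare two profiles, and all pathway names and p-values below are assumptions for illustration only.

```python
def pathway_profile(p_values, pathways, alpha=0.05):
    """Return a 0/1 vector: 1 if a pathway's enrichment p-value is below alpha.

    Pathways missing from p_values are treated as not enriched.
    """
    return [1 if p_values.get(name, 1.0) < alpha else 0 for name in pathways]


def jaccard(profile_a, profile_b):
    """Shared enriched pathways divided by pathways enriched in either profile."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 1.0


# Hypothetical pathway names and enrichment p-values from two data sets.
pathways = ["cell_cycle", "p53_signaling", "dna_repair", "apoptosis"]
dataset_1 = {"cell_cycle": 0.01, "p53_signaling": 0.20, "dna_repair": 0.03}
dataset_2 = {"cell_cycle": 0.02, "p53_signaling": 0.01, "apoptosis": 0.50}

profile_1 = pathway_profile(dataset_1, pathways)
profile_2 = pathway_profile(dataset_2, pathways)
print(profile_1, profile_2, jaccard(profile_1, profile_2))
```

Representing each disease or subtype as a fixed-length 0/1 vector over the same pathway list is what makes profiles from different data sets or technologies directly comparable.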
Affiliation(s)
- Feng Tian
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA
| | - Yajie Wang
- Core Laboratory for Clinical Medical Research, Beijing Tiantan Hospital, Capital Medical University, Beijing, P. R. China
- Department of Clinical Laboratory Diagnosis, Beijing Tiantan Hospital, Capital Medical University, Beijing, P. R. China
| | - Michael Seiler
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA
| | - Zhenjun Hu
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA
|
36
|
Liu Y, Hu Z. Identification of collaborative driver pathways in breast cancer. BMC Genomics 2014; 15:605. [PMID: 25034939 PMCID: PMC4111852 DOI: 10.1186/1471-2164-15-605] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 01/31/2014] [Accepted: 07/07/2014] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND An important challenge in cancer biology is to computationally screen mutations in cancer cells, separating those that might drive cancer initiation and progression from the much larger number of bystanders. Since mutations are large in number and diverse in type, the frequency of any particular mutation pattern across a set of samples is low. This makes statistical distinctions and reproducibility across different populations difficult to establish. RESULTS In this paper we develop a novel method that promises to partially ameliorate these problems. The basic idea is that although mutations are highly heterogeneous and vary from one sample to another, the processes that are disrupted when cells undergo transformation tend to be invariant across a population for a particular cancer or cancer subtype. Specifically, we focus on finding mutated pathway-groups that are invariant across samples of breast cancer subtypes. The identification of informative pathway-groups consists of two steps. The first is identification of pathways significantly enriched in genes containing non-synonymous mutations; the second uses the pathways so identified to find groups that are functionally related in the largest number of samples. An application to 4 subtypes of breast cancer identified pathway-groups that largely explain a particular subtype and are rich in processes associated with transformation. CONCLUSIONS In contrast to previous methods that identify pathways across a set of samples without any further validation, we show that mutated pathway-groups can be found in each breast cancer subtype and that such groups are invariant across the majority of samples. The algorithm is available at http://www.visantnet.org/misi/MUDPAC.zip.
Affiliation(s)
- Yang Liu
- Bioinformatics Graduate Program and Department of Biomedical Engineering, Boston University, 24 Cummington Mall, Boston, MA 02215 USA
| | - Zhenjun Hu
- Bioinformatics Graduate Program and Department of Biomedical Engineering, Boston University, 24 Cummington Mall, Boston, MA 02215 USA
|
37
|
Li J, Li C, Han J, Zhang C, Shang D, Yao Q, Zhang Y, Xu Y, Liu W, Zhou M, Yang H, Su F, Li X. The detection of risk pathways, regulated by miRNAs, via the integration of sample-matched miRNA-mRNA profiles and pathway structure. J Biomed Inform 2014; 49:187-97. [PMID: 24561483 DOI: 10.1016/j.jbi.2014.02.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 09/10/2013] [Revised: 12/17/2013] [Accepted: 02/03/2014] [Indexed: 11/26/2022]
Abstract
The use of genome-wide, sample-matched miRNA-mRNA expression data provides a powerful tool for the investigation of miRNAs and genes involved in diseases. The identification of miRNA-regulated pathways has been crucial for analysis of the role of miRNAs. However, the classical identification method fails to consider the structural information of pathways and the regulation of miRNAs simultaneously. We proposed a method that simultaneously integrated the change in gene expression and structural information in order to identify pathways. Our method used fold changes in miRNAs and gene products, along with the quantification of the regulatory effect on target genes, to measure the change in gene expression. Topological characteristics were investigated to measure the influence of gene products on entire pathways. Through the analysis of multiple myeloma and prostate cancer expression data, our method was shown to be effective and reliable in identifying disease risk pathways that are regulated by miRNAs. Further analysis showed that the structure of a pathway plays a crucial role in the recognition of the pathway as a factor in disease risk.
Affiliation(s)
- Jing Li
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China; Department of Bioinformatics, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350004, PR China
- Chunquan Li
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Junwei Han
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Chunlong Zhang
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Desi Shang
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Qianlan Yao
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Yunpeng Zhang
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Yanjun Xu
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Wei Liu
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Meng Zhou
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Haixiu Yang
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Fei Su
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
- Xia Li
- College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China.
38
Pyatnitskiy M, Mazo I, Shkrob M, Schwartz E, Kotelnikova E. Clustering gene expression regulators: new approach to disease subtyping. PLoS One 2014; 9:e84955. [PMID: 24416320 PMCID: PMC3887006 DOI: 10.1371/journal.pone.0084955] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 08/22/2013] [Accepted: 11/20/2013] [Indexed: 12/29/2022] Open
Abstract
One of the main challenges in modern medicine is to stratify patient groups by the underlying molecular mechanisms of disease so as to develop a more personalized approach to therapy. Here we propose a novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on the Sub-Network Enrichment Analysis (SNEA) algorithm, which identifies gene subnetworks with significant concordant changes in expression between two conditions. Each subnetwork consists of a central regulator and downstream genes connected by relations drawn from a global literature-extracted regulation database. Regulators found in each patient separately are clustered together and assigned activity scores, which are used for the final grouping of patients. We show that our approach performs well compared to other related methods and at the same time provides researchers with a complementary level of understanding of the pathway-level biology behind a disease by identifying significant expression regulators. We observed a reasonable grouping of neuromuscular disorders (triggered by structural damage vs triggered by unknown mechanisms) that was not revealed using standard expression profile clustering. In another experiment we were able to suggest clusters of regulators responsible for colorectal carcinoma vs adenoma discrimination and to identify frequently genetically changed regulators that could be of specific importance for the individual characteristics of cancer development. The proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. The resulting clusters of regulators make it possible to generate valuable biological hypotheses about molecular mechanisms related to a clinical outcome for an individual patient.
Affiliation(s)
- Mikhail Pyatnitskiy
- Institute of Biomedical Chemistry, RAMS, Moscow, Russia
- Ariadne Diagnostics LLC, Rockville, Maryland, United States of America
- Ilya Mazo
- Ariadne Diagnostics LLC, Rockville, Maryland, United States of America
- Maria Shkrob
- Elsevier Inc, Rockville, Maryland, United States of America
- Elena Schwartz
- Ariadne Diagnostics LLC, Rockville, Maryland, United States of America
- Ekaterina Kotelnikova
- Ariadne Diagnostics LLC, Rockville, Maryland, United States of America
- Institute for Information Transmission Problems, RAS, Moscow, Russia
39
Mitrea C, Taghavi Z, Bokanizad B, Hanoudi S, Tagett R, Donato M, Voichiţa C, Drăghici S. Methods and approaches in the topology-based analysis of biological pathways. Front Physiol 2013; 4:278. [PMID: 24133454 PMCID: PMC3794382 DOI: 10.3389/fphys.2013.00278] [Citation(s) in RCA: 136] [Impact Index Per Article: 12.4] [Received: 05/31/2013] [Accepted: 09/15/2013] [Indexed: 11/21/2022] Open
Abstract
The goal of pathway analysis is to identify the pathways significantly impacted in a given phenotype. Many current methods are based on algorithms that consider pathways as simple gene lists, dramatically under-utilizing the knowledge that such pathways are meant to capture. During the past few years, a plethora of methods claiming to incorporate various aspects of the pathway topology have been proposed. These topology-based methods, sometimes referred to as “third generation,” have the potential to better model the phenomena described by pathways. Although there is now a large variety of approaches used for this purpose, no review is currently available to offer guidance for potential users and developers. This review covers 22 such topology-based pathway analysis methods published in the last decade. We compare these methods based on: type of pathways analyzed (e.g., signaling or metabolic), input (subset of genes, all genes, fold changes, gene p-values, etc.), mathematical models, pathway scoring approaches, output (one or more pathway scores, p-values, etc.) and implementation (web-based, standalone, etc.). We identify and discuss challenges, arising both in methodology and in pathway representation, including inconsistent terminology, different data formats, lack of meaningful benchmarks, and the lack of tissue and condition specificity.
Affiliation(s)
- Cristina Mitrea
- Department of Computer Science, Wayne State University Detroit, MI, USA
40
Liu W, Li C, Xu Y, Yang H, Yao Q, Han J, Shang D, Zhang C, Su F, Li X, Xiao Y, Zhang F, Dai M, Li X. Topologically inferring risk-active pathways toward precise cancer classification by directed random walk. Bioinformatics 2013; 29:2169-77. [PMID: 23842813 DOI: 10.1093/bioinformatics/btt373] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Indexed: 01/29/2023]
Abstract
MOTIVATION: The accurate prediction of disease status is a central challenge in clinical cancer research. Microarray-based gene biomarkers have been identified to predict outcome and outperform traditional clinical parameters. However, the robustness of individual gene biomarkers is questioned because of their poor reproducibility between different cohorts of patients. Substantial progress in treatment requires advances in methods to identify robust biomarkers. Several methods incorporating pathway information have been proposed to identify robust pathway markers and build classifiers at the level of functional categories rather than of individual genes. However, current methods treat pathways as simple gene sets and ignore pathway topological information, which is essential to infer a more robust pathway activity. RESULTS: Here, we propose a directed random walk (DRW)-based method to infer pathway activity. DRW evaluates the topological importance of each gene by capturing the structural information embedded in the directed pathway network. The strategy of weighting genes by their topological importance greatly improved the reproducibility of pathway activities. Experiments on 18 cancer datasets showed that the proposed method yielded a more accurate and robust overall performance compared with several existing gene-based and pathway-based classification methods. The resulting risk-active pathways are more reliable in guiding therapeutic selection and the development of pathway-specific therapeutic strategies. AVAILABILITY: DRW is freely available at http://210.46.85.180:8080/DRWPClass/
Affiliation(s)
- Wei Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
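The topological weighting described in the abstract above can be illustrated with a generic random walk with restart on a small directed graph. This is only a minimal sketch under stated assumptions: the four-gene toy graph, the uniform starting vector, the restart probability of 0.7, and the function name `random_walk_with_restart` are all hypothetical and are not taken from the DRW paper, whose edge orientation and initialisation may differ.

```python
import numpy as np


def random_walk_with_restart(adj, p0, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W^T p + r * p0 until the update is below tol.

    adj[i, j] = 1 encodes a directed edge from gene i to gene j.
    """
    adj = np.asarray(adj, dtype=float)
    out_deg = adj.sum(axis=1, keepdims=True)
    # Row-normalise; genes without outgoing edges keep an all-zero row,
    # so the resulting vector need not sum to 1.
    trans = np.divide(adj, out_deg, out=np.zeros_like(adj), where=out_deg > 0)
    p0 = np.asarray(p0, dtype=float)
    p0 = p0 / p0.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (trans.T @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical toy pathway: gene 0 -> genes 1 and 2, which both -> gene 3.
toy_adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
weights = random_walk_with_restart(toy_adj, p0=[1, 1, 1, 1])
```

In this toy graph the downstream gene 3 receives the largest weight because the walk flows along edge direction; a pathway activity score could then be formed as a weighted combination of gene-level statistics, which is the general idea the abstract describes.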
41
Cancer systems biology in the genome sequencing era: part 1, dissecting and modeling of tumor clones and their networks. Semin Cancer Biol 2013; 23:279-85. [PMID: 23791722 DOI: 10.1016/j.semcancer.2013.06.002] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Received: 04/30/2013] [Revised: 06/04/2013] [Accepted: 06/09/2013] [Indexed: 02/05/2023]
Abstract
Recent tumor genome sequencing confirmed that one tumor often consists of multiple cell subpopulations (clones) which bear different, but related, genetic profiles such as mutation and copy number variation profiles. Thus far, one tumor has been viewed as a whole entity in cancer functional studies. With the advances of genome sequencing and computational analysis, we are able to quantify and computationally dissect clones from tumors, and then conduct clone-based analysis. Emerging technologies such as single-cell genome sequencing and RNA-Seq could profile tumor clones. Thus, we should reconsider how to conduct cancer systems biology studies in the genome sequencing era. We outline new directions for conducting cancer systems biology by considering that genome sequencing technology can be used for dissecting, quantifying and genetically characterizing clones from tumors. Topics discussed in Part 1 of this review include the computational quantification of tumor subpopulations, clone-based network modeling, cancer hallmark-based networks and their high-order rewiring principles, and the principles of cell survival networks of fast-growing clones.
42
Abstract
Life science technologies generate a deluge of data that hold the keys to unlocking the secrets of important biological functions and disease mechanisms. We present DEAP, Differential Expression Analysis for Pathways, which capitalizes on information about biological pathways to identify important regulatory patterns from differential expression data. DEAP makes significant improvements over existing approaches by including information about pathway structure and discovering the most differentially expressed portion of the pathway. On simulated data, DEAP significantly outperformed traditional methods: with high differential expression, DEAP increased power by two orders of magnitude; with very low differential expression, DEAP doubled the power. DEAP performance was illustrated on two different gene and protein expression studies. DEAP discovered fourteen important pathways related to chronic obstructive pulmonary disease and interferon treatment that existing approaches omitted. On the interferon study, DEAP guided focus towards a four protein path within the 26 protein Notch signalling pathway. The data deluge represents a growing challenge for life sciences. Within this sea of data surely lie many secrets to understanding important biological and medical systems. To quantify important patterns in this data, we present DEAP (Differential Expression Analysis for Pathways). DEAP amalgamates information about biological pathway structure and differential expression to identify important patterns of regulation. On both simulated and biological data, we show that DEAP is able to identify key mechanisms while making significant improvements over existing methodologies. For example, on the interferon study, DEAP uniquely identified both the interferon gamma signalling pathway and the JAK STAT signalling pathway.
43
Chang B, Kustra R, Tian W. Functional-network-based gene set analysis using gene-ontology. PLoS One 2013; 8:e55635. [PMID: 23418449 PMCID: PMC3572115 DOI: 10.1371/journal.pone.0055635] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/28/2012] [Accepted: 12/31/2012] [Indexed: 11/19/2022] Open
Abstract
To account for the functional non-equivalence among a set of genes within a biological pathway when performing gene set analysis, we introduce GOGANPA, a network-based gene set analysis method, which up-weights genes with functions relevant to the gene set of interest. Each gene is weighted according to its degree within a genome-scale functional network constructed using the functional annotations available from the gene ontology database. By benchmarking GOGANPA using a well-studied P53 data set and three breast cancer data sets, we demonstrate the power and reproducibility of our proposed method over traditional unweighted approaches and a competing network-based approach that involves a complex integrated network. GOGANPA’s sole reliance on gene ontology further allows it to be widely applicable to the analysis of any gene-ontology-annotated genome.
Affiliation(s)
- Billy Chang
- State Key Laboratory of Genetic Engineering, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, P.R. China
- Dalla Lana School of Public Health, Division of Biostatistics, University of Toronto, Toronto, Ontario, Canada
- Rafal Kustra
- Dalla Lana School of Public Health, Division of Biostatistics, University of Toronto, Toronto, Ontario, Canada
- Weidong Tian
- State Key Laboratory of Genetic Engineering, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, P.R. China
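The idea of up-weighting well-connected genes, as described in the abstract above, can be sketched as a degree-weighted average of gene-level statistics. This is a simplified illustration, not the GOGANPA weighting scheme itself: the gene names `A` to `D`, the toy edge list, the example statistics, and the helper names `degree_weights` and `weighted_set_score` are all hypothetical.

```python
def degree_weights(edges, genes):
    """Count, for each gene in `genes`, how many network edges touch it."""
    degree = {g: 0 for g in genes}
    for a, b in edges:
        if a in degree:
            degree[a] += 1
        if b in degree:
            degree[b] += 1
    return degree


def weighted_set_score(stats, weights, gene_set):
    """Weighted mean of gene-level statistics over the members of a gene set."""
    members = [g for g in gene_set if g in stats]
    total_weight = sum(weights.get(g, 0) for g in members)
    if total_weight == 0:
        # Fall back to the plain mean when no member has any network support.
        return sum(stats[g] for g in members) / len(members)
    return sum(weights.get(g, 0) * stats[g] for g in members) / total_weight


# Hypothetical functional network and per-gene differential-expression statistics.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
toy_stats = {"A": 3.0, "B": 1.0, "C": 1.0, "D": 0.0}
toy_set = ["A", "B", "C", "D"]

toy_weights = degree_weights(toy_edges, toy_set)
weighted = weighted_set_score(toy_stats, toy_weights, toy_set)
unweighted = sum(toy_stats[g] for g in toy_set) / len(toy_set)
```

Here the hub gene `A` dominates the weighted score (1.625 versus an unweighted mean of 1.25), which shows how connectivity can shift a gene set statistic toward its best-supported members.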
44
Abstract
Thanks to the dramatic reduction in the cost of high-throughput techniques in modern biotechnology, searching for differentially expressed genes is already a common procedure for identifying biomarkers or signatures of phenotypic states such as diseases or compound treatments. However, in most cases, especially in complex diseases, the underlying biological mechanisms remain obscure even given a list of biomarkers. In other words, rather than knowing which genes are involved, we are more interested in discovering the common, collective roles of all these genes. Based on the assumption that genes involved in the same biological processes, functions, or localizations show correlated behavior in terms of expression levels, signal intensities, allele occurrences, and so on, we can apply statistical tests to find perturbed pathways. Gene set/pathway enrichment analysis is one such technique; step-by-step instructions are given in this chapter.
Affiliation(s)
- Jui-Hung Hung
- Program in Bioinformatics and Integrative Biology, Worcester, MA, USA.
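One common statistical test for gene set enrichment is the one-sided hypergeometric (over-representation) test. The sketch below is an illustration of that general test rather than the specific procedure of the chapter summarised above; the function name `hypergeom_enrichment_p` and the example counts are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_universe: number of genes measured in total
    n_pathway:  number of those genes annotated to the pathway
    n_selected: number of differentially expressed genes
    n_overlap:  number of differentially expressed genes in the pathway
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes measured, 5 in the pathway,
# 4 differentially expressed, 3 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
```

In practice such a p-value would be computed for every candidate pathway and then corrected for multiple testing.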
45
Wang PI, Hwang S, Kincaid RP, Sullivan CS, Lee I, Marcotte EM. RIDDLE: reflective diffusion and local extension reveal functional associations for unannotated gene sets via proximity in a gene network. Genome Biol 2012; 13:R125. [PMID: 23268829 PMCID: PMC4056375 DOI: 10.1186/gb-2012-13-12-r125] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 06/27/2012] [Accepted: 12/26/2012] [Indexed: 01/08/2023] Open
Abstract
The growing availability of large-scale functional networks has promoted the development of many successful techniques for predicting functions of genes. Here we extend these network-based principles and techniques to functionally characterize whole sets of genes. We present RIDDLE (Reflective Diffusion and Local Extension), which uses well developed guilt-by-association principles upon a human gene network to identify associations of gene sets. RIDDLE is particularly adept at characterizing sets with no annotations, a major challenge where most traditional set analyses fail. Notably, RIDDLE found microRNA-450a to be strongly implicated in ocular diseases and development. A web application is available at http://www.functionalnet.org/RIDDLE.
46
Gao S, Jia S, Hessner MJ, Wang X. Predicting disease-related subnetworks for type 1 diabetes using a new network activity score. OMICS: A Journal of Integrative Biology 2012; 16:566-78. [PMID: 22917479 DOI: 10.1089/omi.2012.0029] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 01/09/2023]
Abstract
In this study we investigated the advantage of including network information in prioritizing disease genes of type 1 diabetes (T1D). First, a naïve Bayesian network (NBN) model was developed to integrate information from multiple data sources and to define a T1D-involvement probability score (PS) for each individual gene. The algorithm was validated using known functional candidate genes as a benchmark. Genes with higher PS were found to be more likely to appear in T1D-related publications. Next a new network activity metric was proposed to evaluate the T1D relevance of protein-protein interaction (PPI) subnetworks. The metric considered the contribution both from individual genes and from network topological characteristics. The predictions were confirmed by several independent datasets, including a genome wide association study (GWAS), and two large-scale human gene expression studies. We found that novel candidate genes in the T1D subnetworks showed more significant associations with T1D than genes predicted using PS alone. Interestingly, most novel candidates were not encoded within the human leukocyte antigen (HLA) region, and their expression levels showed correlation with disease only in cohorts with low-risk HLA genotypes. The results suggested the importance of mapping disease gene networks in dissecting the genetics of complex diseases, and offered a general approach to network-based disease gene prioritization from multiple data sources.
Affiliation(s)
- Shouguo Gao
- Department of Physics, the University of Alabama at Birmingham, Birmingham, Alabama 35294, USA
47
Kim S, Kon M, DeLisi C. Pathway-based classification of cancer subtypes. Biol Direct 2012; 7:21. [PMID: 22759382 PMCID: PMC3485163 DOI: 10.1186/1745-6150-7-21] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Received: 12/08/2011] [Accepted: 05/15/2012] [Indexed: 12/21/2022] Open
Abstract
Background Molecular markers based on gene expression profiles have been used in experimental and clinical settings to distinguish cancerous tumors in stage, grade, survival time, metastasis, and drug sensitivity. However, most significant gene markers are unstable (not reproducible) among data sets. We introduce a standardized method for representing cancer markers as 2-level hierarchical feature vectors, with a basic gene level as well as a second level of (more stable) pathway markers, for the purpose of discriminating cancer subtypes. This extends standard gene expression arrays with new pathway-level activation features obtained directly from off-the-shelf gene set enrichment algorithms such as GSEA. Such so-called pathway-based expression arrays are significantly more reproducible across datasets. Such reproducibility will be important for clinical usefulness of genomic markers, and augment currently accepted cancer classification protocols. Results The present method produced more stable (reproducible) pathway-based markers for discriminating breast cancer metastasis and ovarian cancer survival time. Between two datasets for breast cancer metastasis, the intersection of standard significant gene biomarkers totaled 7.47% of selected genes, compared to 17.65% using pathway-based markers; the corresponding percentages for ovarian cancer datasets were 20.65% and 33.33% respectively. Three pathways, consisting of Type_1_diabetes mellitus, Cytokine-cytokine_receptor_interaction and Hedgehog_signaling (all previously implicated in cancer), are enriched in both the ovarian long survival and breast non-metastasis groups. In addition, integrating pathway and gene information, we identified five (ID4, ANXA4, CXCL9, MYLK, FBXL7) and six (SQLE, E2F1, PTTG1, TSTA3, BUB1B, MAD2L1) known cancer genes significant for ovarian and breast cancer respectively. 
Conclusions Standardizing the analysis of genomic data in the process of cancer staging, classification and analysis is important as it has implications for both pre-clinical as well as clinical studies. The paradigm of diagnosis and prediction using pathway-based biomarkers as features can be an important part of the process of biomarker-based cancer analysis, and the resulting canonical (clinically reproducible) biomarkers can be important in standardizing genomic data. We expect that identification of such canonical biomarkers will improve clinical utility of high-throughput datasets for diagnostic and prognostic applications. Reviewers This article was reviewed by John McDonald (nominated by I. King Jordon), Eugene Koonin, Nathan Bowen (nominated by I. King Jordon), and Ekaterina Kotelnikova (nominated by Mikhail Gelfand).
Affiliation(s)
- Shinuk Kim
- Bioinformatics program, Boston University, Boston, MA 02215, USA
48
Tang H, Zhong F, Xie H. A quick guide to biomolecular network studies: construction, analysis, applications, and resources. Biochem Biophys Res Commun 2012; 424:7-11. [PMID: 22732414 DOI: 10.1016/j.bbrc.2012.06.085] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 06/05/2012] [Accepted: 06/18/2012] [Indexed: 10/28/2022]
Abstract
Over the past decade, a rapid increase in network data including signaling, transcription regulation, metabolic reaction, protein-protein interaction and genetic interaction has been observed. Many biological questions have been investigated by analyzing these diverse networks, providing new insights into biology. Networks also play an important role in disease studies including disease gene screening and clinical diagnosis. A large number of databases and software tools have been developed to facilitate the storage, exchange, integration, and analysis of network data, and network analysis is becoming a routine procedure for biologists to infer biological information. In this review, several main aspects of network studies are discussed, including network construction, analysis, application, and resources.
Affiliation(s)
- Hailin Tang
- College of Mechanical & Electronic Engineering and Automatization, National University of Defense Technology, Changsha 410073, China
49
Gu Z, Liu J, Cao K, Zhang J, Wang J. Centrality-based pathway enrichment: a systematic approach for finding significant pathways dominated by key genes. BMC Syst Biol 2012; 6:56. [PMID: 22672776 PMCID: PMC3443660 DOI: 10.1186/1752-0509-6-56] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 01/31/2012] [Accepted: 05/24/2012] [Indexed: 12/18/2022]
Abstract
Background Biological pathways are important for understanding biological mechanisms. Thus, finding important pathways that underlie biological problems helps researchers to focus on the most relevant sets of genes. Pathways resemble networks with complicated structures, but most of the existing pathway enrichment tools ignore topological information embedded within pathways, which limits their applicability. Results A systematic and extensible pathway enrichment method in which nodes are weighted by network centrality was proposed. We demonstrate how choice of pathway structure and centrality measurement, as well as the presence of key genes, affects pathway significance. We emphasize two improvements of our method over current methods. First, allowing for the diversity of genes’ characters and the difficulty of covering gene importance from all aspects, we set centrality as an optional parameter in the model. Second, nodes rather than genes form the basic unit of pathways, such that one node can be composed of several genes and one gene may reside in different nodes. By comparing our methodology to the original enrichment method using both simulation data and real-world data, we demonstrate the efficacy of our method in finding new pathways from biological perspective. Conclusions Our method can benefit the systematic analysis of biological pathways and help to extract more meaningful information from gene expression data. The algorithm has been implemented as an R package CePa, and also a web-based version of CePa is provided.
Affiliation(s)
- Zuguang Gu
- The State Key Laboratory of Pharmaceutical Biotechnology and Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, School of Life Science, Nanjing University, Nanjing, 210093, China
50
Huang CL, Lamb J, Chindelevitch L, Kostrowicki J, Guinney J, DeLisi C, Ziemek D. Correlation set analysis: detecting active regulators in disease populations using prior causal knowledge. BMC Bioinformatics 2012; 13:46. [PMID: 22443377 PMCID: PMC3382432 DOI: 10.1186/1471-2105-13-46] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 09/09/2011] [Accepted: 03/23/2012] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND: Identification of active causal regulators is a crucial problem in understanding disease mechanisms or finding drug targets. Methods that infer causal regulators directly from primary data have been proposed and successfully validated in some cases. These methods necessarily require very large sample sizes or a mix of different data types. Recent studies have shown that prior biological knowledge can successfully boost a method's ability to find regulators. RESULTS: We present a simple data-driven method, Correlation Set Analysis (CSA), for comprehensively detecting active regulators in disease populations by integrating co-expression analysis and a specific type of literature-derived causal relationships. Instead of investigating the co-expression level between regulators and their regulatees, we focus on the coherence of the regulatees of a regulator. Using simulated datasets we show that our method performs very well at recovering even weak regulatory relationships with a low false discovery rate. Using three separate real biological datasets we were able to recover well-known and as-yet-undescribed active regulators for each disease population. The results are represented as a rank-ordered list of regulators and reveal both single and higher-order regulatory relationships. CONCLUSIONS: CSA is an intuitive data-driven way of selecting directed perturbation experiments that are relevant to a disease population of interest and represents a starting point for further investigation. Our findings demonstrate that combining co-expression analysis on regulatee sets with a literature-derived network can successfully identify causal regulators and help develop possible hypotheses to explain disease progression.
Affiliation(s)
- Chia-Ling Huang
- Bioinformatics Graduate Program, and Department of Biomedical Engineering, Boston University, 44 Cummington Street, Boston, MA 02215, USA
- John Lamb
- Oncology Research Unit, Worldwide Research & Development, Pfizer, 10646 Science center Drive, San Diego, CA 92121, USA
- Leonid Chindelevitch
- Computational Sciences Center of Emphasis, Worldwide Research & Development, Pfizer, 35 Cambridgepark Drive, Cambridge, MA 02140, USA
- Jarek Kostrowicki
- Oncology Research Unit, Worldwide Research & Development, Pfizer, 10646 Science center Drive, San Diego, CA 92121, USA
- Justin Guinney
- Sage Bionetworks, 1100 Fairview Ave North, Seattle, WA 98109, USA
- Charles DeLisi
- Bioinformatics Graduate Program, and Department of Biomedical Engineering, Boston University, 44 Cummington Street, Boston, MA 02215, USA
- Daniel Ziemek
- Computational Sciences Center of Emphasis, Worldwide Research & Development, Pfizer, 35 Cambridgepark Drive, Cambridge, MA 02140, USA
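The notion of regulatee coherence in the abstract above can be sketched as the mean absolute pairwise correlation among a regulator's target genes, compared against randomly drawn gene sets of the same size. This is a minimal sketch under stated assumptions: the scoring function, the permutation scheme, the synthetic expression matrix, and the names `regulatee_coherence` and `coherence_p_value` are illustrative choices and may not match the published CSA statistic.

```python
import numpy as np


def regulatee_coherence(expr, gene_idx):
    """Mean absolute pairwise Pearson correlation among the selected genes.

    expr is a genes x samples matrix; gene_idx selects the regulatee rows.
    """
    corr = np.corrcoef(expr[gene_idx])
    upper = np.triu_indices(len(gene_idx), k=1)
    return float(np.mean(np.abs(corr[upper])))


def coherence_p_value(expr, gene_idx, n_perm=200, seed=0):
    """Empirical p-value against random gene sets of the same size."""
    rng = np.random.default_rng(seed)
    observed = regulatee_coherence(expr, gene_idx)
    size = len(gene_idx)
    null = [
        regulatee_coherence(expr, rng.choice(expr.shape[0], size=size, replace=False))
        for _ in range(n_perm)
    ]
    p = (1 + sum(score >= observed for score in null)) / (n_perm + 1)
    return observed, p


# Synthetic data (assumption): 30 genes x 40 samples, where the first five
# genes share a common signal and stand in for one regulator's regulatees.
data_rng = np.random.default_rng(42)
expr = data_rng.normal(size=(30, 40))
expr[:5] += 3.0 * data_rng.normal(size=40)

observed, p_value = coherence_p_value(expr, np.arange(5))
```

Ranking regulators by such a score would yield the kind of rank-ordered list the abstract mentions, although a real analysis would also need multiple-testing control across regulators.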