1
Chitra U, Park TY, Raphael BJ. NetMix2: A Principled Network Propagation Algorithm for Identifying Altered Subnetworks. J Comput Biol 2022; 29:1305-1323. [PMID: 36525308 PMCID: PMC9917315 DOI: 10.1089/cmb.2022.0336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022] Open
Abstract
A standard paradigm in computational biology is to leverage interaction networks as prior knowledge in analyzing high-throughput biological data, where the data give a score for each vertex in the network. One classical approach is the identification of altered subnetworks, or subnetworks of the interaction network that have both outlier vertex scores and a defined network topology. One class of algorithms for identifying altered subnetworks searches for high-scoring subnetworks in subnetwork families with simple topological constraints, such as connected subnetworks, and has sound statistical guarantees. A second class of algorithms employs network propagation (the smoothing of vertex scores over the network using a random walk or diffusion process) and utilizes the global structure of the network. However, network propagation algorithms often rely on ad hoc heuristics that lack a rigorous statistical foundation. In this work, we unify the subnetwork family and network propagation approaches by deriving the propagation family, a subnetwork family that approximates the sets of vertices ranked highly by network propagation approaches. We introduce NetMix2, a principled algorithm for identifying altered subnetworks from a wide range of subnetwork families. When using the propagation family, NetMix2 combines the advantages of the subnetwork family and network propagation approaches. NetMix2 outperforms other methods, including network propagation, on simulated data, pan-cancer somatic mutation data, and genome-wide association data from multiple human diseases.
Affiliation(s)
- Uthsav Chitra
  - Department of Computer Science, Princeton University, Princeton, New Jersey, USA
- Tae Yoon Park
  - Department of Computer Science, Princeton University, Princeton, New Jersey, USA
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, USA
- Benjamin J. Raphael
  - Department of Computer Science, Princeton University, Princeton, New Jersey, USA
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, USA
2
Xu M, Bermea KC, Ayati M, Kim HB, Yang X, Medina A, Fu Z, Heravi A, Zhang X, Na CH, Everett AD, Gabrielson K, Foster DB, Paolocci N, Murphy AM, Ramirez-Correa GA. Alteration in tyrosine phosphorylation of cardiac proteome and EGFR pathway contribute to hypertrophic cardiomyopathy. Commun Biol 2022; 5:1251. [PMID: 36380187 PMCID: PMC9666710 DOI: 10.1038/s42003-022-04021-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2020] [Accepted: 09/22/2022] [Indexed: 11/16/2022] Open
Abstract
Alterations of serine/threonine phosphorylation of the cardiac proteome are a hallmark of heart failure. However, the contribution of tyrosine phosphorylation (pTyr) to the pathogenesis of cardiac hypertrophy remains unclear. We use global mapping to discover and quantify site-specific pTyr in two cardiac hypertrophic mouse models, i.e., cardiac overexpression of ErbB2 (TgErbB2) and α myosin heavy chain R403Q (R403Q-αMyHC Tg), compared to control hearts. This mapping reveals significant phosphoproteomic alterations in TgErbB2 mice in right ventricular cardiomyopathy, hypertrophic cardiomyopathy (HCM), and dilated cardiomyopathy (DCM) pathways. On the other hand, R403Q-αMyHC Tg mice indicate that the EGFR1 pathway is central for cardiac hypertrophy, along with activation of the angiopoietin, ErbB, growth hormone, and chemokine signaling pathways. Surprisingly, most myofilament proteins show downregulation of pTyr rather than upregulation. Kinase-substrate enrichment analysis (KSEA) shows a marked downregulation of MAPK pathway activity downstream of k-Ras in TgErbB2 mice and activation of EGFR, focal adhesion, PDGFR, and actin cytoskeleton pathways. In vivo ErbB2 inhibition by AG-825 decreases cardiomyocyte disarray. Serine/threonine and tyrosine phosphoproteome confirm the above-described pathways and the effectiveness of AG-825 treatment. Thus, altered pTyr may play a regulatory role in cardiac hypertrophic models.
Affiliation(s)
- Mingguo Xu
  - Department of Pediatrics/Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Pediatrics, The Third People’s Hospital of Longgang District, Shenzhen, 518115, China
- Kevin C. Bermea
  - Department of Pediatrics/Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Marzieh Ayati
  - Department of Computer Science/College of Engineering and Computer Science, University of Texas Rio Grande Valley School of Medicine, Edinburg, Texas, USA
- Han Byeol Kim
  - Department of Neurology/Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Xiaomei Yang
  - Department of Anesthesiology, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Ji’nan, China
- Andres Medina
  - Department of Molecular Science/UT Health Rio Grande Valley, McAllen, TX, USA
- Zongming Fu
  - Department of Pediatrics/Division of Hematology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Amir Heravi
  - Department of Pediatrics/Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Xinyu Zhang
  - Department of Cardiology, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Ji’nan, China
- Chan Hyun Na
  - Department of Neurology/Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Biological Chemistry/McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Allen D. Everett
  - Department of Pediatrics/Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kathleen Gabrielson
  - Department of Molecular and Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- D. Brian Foster
  - Department of Medicine/Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Nazareno Paolocci
  - Department of Medicine/Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Anne M. Murphy
  - Department of Pediatrics/Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Genaro A. Ramirez-Correa
  - Department of Pediatrics/Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Molecular Science/UT Health Rio Grande Valley, McAllen, TX, USA
3
Ayati M, Chance MR, Koyutürk M. Co-phosphorylation networks reveal subtype-specific signaling modules in breast cancer. Bioinformatics 2021; 37:221-228. [PMID: 32730576 DOI: 10.1093/bioinformatics/btaa678] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/08/2019] [Revised: 07/10/2020] [Accepted: 07/22/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Protein phosphorylation is a ubiquitous mechanism of post-translational modification that plays a central role in cellular signaling. Phosphorylation is particularly important in the context of cancer, as downregulation of tumor suppressors and upregulation of oncogenes by the dysregulation of associated kinase and phosphatase networks are shown to have key roles in tumor growth and progression. Despite recent advances that enable large-scale monitoring of protein phosphorylation, these data are not fully incorporated into such computational tasks as phenotyping and subtyping of cancers. RESULTS: We develop a network-based algorithm, CoPPNet, to enable unsupervised subtyping of cancers using phosphorylation data. For this purpose, we integrate prior knowledge on evolutionary, structural and functional association of phosphosites, kinase-substrate associations and protein-protein interactions with the correlation of phosphorylation of phosphosites across different tumor samples (a.k.a. co-phosphorylation) to construct a context-specific weighted network of phosphosites. We then mine these networks to identify subnetworks with correlated phosphorylation patterns. We apply the proposed framework to two mass-spectrometry-based phosphorylation datasets for breast cancer (BC), and observe that (i) the phosphorylation patterns of the identified subnetworks are highly correlated with clinically identified subtypes, and (ii) the identified subnetworks are highly reproducible across datasets that are derived from different studies. Our results show that integration of quantitative phosphorylation data with network frameworks can provide mechanistic insights into the differences between the signaling mechanisms that drive BC subtypes. Furthermore, the reproducibility of the identified subnetworks suggests that phosphorylation can provide robust classification of disease response and markers.
AVAILABILITY AND IMPLEMENTATION: CoPPNet is available at http://compbio.case.edu/coppnet/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marzieh Ayati
  - Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
- Mark R Chance
  - Department of Nutrition, Case Western Reserve University, Cleveland, OH 44106, USA
  - Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH 44106, USA
- Mehmet Koyutürk
  - Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH 44106, USA
  - Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
4
Reyna MA, Chitra U, Elyanow R, Raphael BJ. NetMix: A Network-Structured Mixture Model for Reduced-Bias Estimation of Altered Subnetworks. J Comput Biol 2021; 28:469-484. [PMID: 33400606 DOI: 10.1089/cmb.2020.0435] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022] Open
Abstract
A classic problem in computational biology is the identification of altered subnetworks: subnetworks of an interaction network that contain genes/proteins that are differentially expressed, highly mutated, or otherwise aberrant compared with other genes/proteins. Numerous methods have been developed to solve this problem under various assumptions, but the statistical properties of these methods are often unknown. For example, some widely used methods are reported to output very large subnetworks that are difficult to interpret biologically. In this work, we formulate the identification of altered subnetworks as the problem of estimating the parameters of a class of probability distributions that we call the Altered Subset Distribution (ASD). We derive a connection between a popular method, jActiveModules, and the maximum likelihood estimator (MLE) of the ASD. We show that the MLE is statistically biased, explaining the large subnetworks output by jActiveModules. Based on these insights, we introduce NetMix, an algorithm that uses Gaussian mixture models to obtain less biased estimates of the parameters of the ASD. We demonstrate that NetMix outperforms existing methods in identifying altered subnetworks on both simulated and real data, including the identification of differentially expressed genes from both microarray and RNA-seq experiments and the identification of cancer driver genes in somatic mutation data.
Affiliation(s)
- Matthew A Reyna
  - Department of Biomedical Informatics, Emory University, Atlanta, Georgia, USA
- Uthsav Chitra
  - Department of Computer Science, Princeton University, Princeton, New Jersey, USA
- Rebecca Elyanow
  - Department of Computer Science, Princeton University, Princeton, New Jersey, USA
  - Department of Computer Science, Brown University, Providence, Rhode Island, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, New Jersey, USA
5
Liu D, Skomorovska Y, Song J, Bowler E, Harris R, Ravasz M, Bai S, Ayati M, Tamai K, Koyuturk M, Yuan X, Wang Z, Wang Y, Ewing R. ELF3 is an antagonist of oncogenic-signalling-induced expression of EMT-TF ZEB1. Cancer Biol Ther 2018; 20:90-100. [PMID: 30148686 PMCID: PMC6292503 DOI: 10.1080/15384047.2018.1507256] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/24/2018] [Revised: 06/22/2018] [Accepted: 07/29/2018] [Indexed: 12/23/2022] Open
Abstract
Background: Epithelial-to-mesenchymal transition (EMT) is a key step in the transformation of epithelial cells into migratory and invasive tumour cells. Intricate positive and negative regulatory processes regulate EMT. Many oncogenic signalling pathways can induce EMT, but the specific mechanisms by which this occurs, and how this process is controlled, are not fully understood. Methods: RNA-Seq analysis, computational analysis of protein networks and large-scale cancer genomics datasets were used to identify ELF3 as a negative regulator of the expression of EMT markers. Western blotting coupled to siRNA, as well as analysis of tumour/normal colorectal cancer panels, was used to investigate the expression and function of ELF3. Results: RNA-Seq analysis of colorectal cancer cells expressing mutant and wild-type β-catenin and analysis of colorectal cancer cells expressing inducible mutant RAS showed that ELF3 expression is reduced in response to oncogenic signalling and antagonizes Wnt and RAS oncogenic signalling pathways. Analysis of gene-expression patterns across The Cancer Genome Atlas (TCGA) and protein localization in colorectal cancer tumour panels showed that ELF3 expression is anti-correlated with β-catenin and markers of EMT and correlates with better clinical prognosis. Conclusions: ELF3 is a negative regulator of the EMT transcription factor (EMT-TF) ZEB1 through its function as an antagonist of oncogenic signalling.
Affiliation(s)
- D Liu
  - Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Y Skomorovska
  - School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- J Song
  - School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- E Bowler
  - School of Biological Sciences, Faculty of Natural and Environmental Sciences, University of Southampton, Southampton, UK
- R Harris
  - School of Biological Sciences, Faculty of Natural and Environmental Sciences, University of Southampton, Southampton, UK
- M Ravasz
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- S Bai
  - School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- M Ayati
  - Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio, USA
- K Tamai
  - School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- M Koyuturk
  - Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio, USA
- X Yuan
  - Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Z Wang
  - School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Y Wang
  - Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
  - School of Biological Sciences, Faculty of Natural and Environmental Sciences, University of Southampton, Southampton, UK
- R.M. Ewing
  - School of Biological Sciences, Faculty of Natural and Environmental Sciences, University of Southampton, Southampton, UK
6
Wang CE, Wang JQ, Luo YJ. Systemic tracking of diagnostic function modules for post-menopausal osteoporosis in a differential co-expression network view. Exp Ther Med 2018; 15:2961-2967. [PMID: 29599833 PMCID: PMC5867453 DOI: 10.3892/etm.2018.5787] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/20/2017] [Accepted: 01/02/2018] [Indexed: 12/20/2022] Open
Abstract
Post-menopausal osteoporosis is one of the most common bone diseases in women. The aim of the present study was to predict the diagnostic function modules from a differential co-expression gene network in order to enhance the current understanding of the biological processes and to promote the early prevention and intervention of post-menopausal osteoporosis. The diagnostic function modules were extracted from a differential co-expression network by the established protein-protein interaction (PPI) network analysis. First, significant genes were identified from the differential co-expression network, which were regarded as seed genes. Starting from the seed genes, the sub-networks in this disease, referred to as diagnostic function modules, were exhaustively searched and prioritized through a snowball sampling strategy to identify genes to accurately predict clinical outcomes. In addition, crucial function inference was performed for each diagnostic function module. Based on the microarray and PPI data, the differential co-expression network was constructed, which contained 1,607 genes and 4,197 interactions. A total of 110 seed genes were identified, and nine diagnostic modules that accurately distinguished post-menopausal osteoporosis from healthy controls were screened out from these seed genes. The diagnostic modules may be associated with five functional pathways with emphasis on metabolism. A total of nine diagnostic functional modules screened in the present study may be considered as potential targets for predicting the clinical outcomes of post-menopausal osteoporosis, and may contribute to the early diagnosis and therapy of osteoporosis.
Affiliation(s)
- Chuan-En Wang
  - Department of Minimally Invasive Spine Surgery, Sport Hospital Attached to Chengdu Sport University, Chengdu, Sichuan 610041, P.R. China
- Jin-Qiang Wang
  - Department of Spine Surgery, Weifang Traditional Chinese Hospital, Weifang, Shandong 261041, P.R. China
- Yuan-Jian Luo
  - Department of Vertebrae Disease Surgery, The First People's Hospital of Yulin, Yulin, Guangxi 537000, P.R. China
7
Wiredja DD, Ayati M, Mazhar S, Sangodkar J, Maxwell S, Schlatzer D, Narla G, Koyutürk M, Chance MR. Phosphoproteomics Profiling of Nonsmall Cell Lung Cancer Cells Treated with a Novel Phosphatase Activator. Proteomics 2017; 17. [PMID: 28961369 DOI: 10.1002/pmic.201700214] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 05/31/2017] [Revised: 09/07/2017] [Indexed: 01/17/2023]
Abstract
Activation of protein phosphatase 2A (PP2A) is a promising anticancer therapeutic strategy, as this tumor suppressor has the ability to coordinately downregulate multiple pathways involved in the regulation of cellular growth and proliferation. In order to understand the systems-level perturbations mediated by PP2A activation, we carried out mass spectrometry-based phosphoproteomic analysis of two KRAS mutated non-small cell lung cancer (NSCLC) cell lines (A549 and H358) treated with a novel small molecule activator of PP2A (SMAP). Overall, this permitted quantification of differential signaling across over 1600 phosphoproteins and 3000 phosphosites. Kinase activity assessment and pathway enrichment implicate collective downregulation of RAS and cell cycle kinases in the case of both cell lines upon PP2A activation. However, the effects on RAS-related signaling are attenuated for A549 compared to H358, while the effects on cell cycle-related kinases are noticeably more prominent in A549. Network-based analyses and validation experiments confirm these detailed differences in signaling. These studies reveal the power of phosphoproteomics studies, coupled to computational systems biology, to elucidate global patterns of phosphatase activation and understand the variations in response to PP2A activation across genetically similar NSCLC cell lines.
Affiliation(s)
- Danica D Wiredja
  - Center for Proteomics and Bioinformatics, Department of Nutrition, Case Western Reserve University, Cleveland, OH, USA
- Marzieh Ayati
  - Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA
- Sahar Mazhar
  - Department of Pathology, Case Western Reserve University, Cleveland, OH, USA
- Jaya Sangodkar
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY, USA
- Sean Maxwell
  - Center for Proteomics and Bioinformatics, Department of Nutrition, Case Western Reserve University, Cleveland, OH, USA
- Daniela Schlatzer
  - Center for Proteomics and Bioinformatics, Department of Nutrition, Case Western Reserve University, Cleveland, OH, USA
- Goutham Narla
  - Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH, USA
- Mehmet Koyutürk
  - Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA
- Mark R Chance
  - Center for Proteomics and Bioinformatics, Department of Nutrition, Case Western Reserve University, Cleveland, OH, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH, USA
8
Ayati M, Koyutürk M. PoCos: Population Covering Locus Sets for Risk Assessment in Complex Diseases. PLoS Comput Biol 2016; 12:e1005195. [PMID: 27835645 PMCID: PMC5105987 DOI: 10.1371/journal.pcbi.1005195] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/19/2016] [Accepted: 10/11/2016] [Indexed: 12/17/2022] Open
Abstract
Susceptibility loci identified by GWAS generally account for a limited fraction of heritability. Predictive models based on identified loci also have modest success in risk assessment and therefore are of limited practical use. Many methods have been developed to overcome these limitations by incorporating prior biological knowledge. However, most of the information utilized by these methods is at the level of genes, limiting analyses to variants that are in or proximate to coding regions. We propose a new method that integrates protein-protein interaction (PPI) as well as expression quantitative trait loci (eQTL) data to identify sets of functionally related loci that are collectively associated with a trait of interest. We call such sets of loci “population covering locus sets” (PoCos). The contributions of the proposed approach are three-fold: 1) We consider all possible genotype models for each locus, thereby enabling identification of combinatorial relationships between multiple loci. 2) We develop a framework for the integration of PPI and eQTL into a heterogeneous network model, enabling efficient identification of functionally related variants that are associated with the disease. 3) We develop a novel method to integrate the genotypes of multiple loci in a PoCo into a representative genotype to be used in risk assessment. We test the proposed framework in the context of risk assessment for seven complex diseases: type 1 diabetes (T1D), type 2 diabetes (T2D), psoriasis (PS), bipolar disorder (BD), coronary artery disease (CAD), hypertension (HT), and multiple sclerosis (MS). Our results show that the proposed method significantly outperforms individual variant-based risk assessment models as well as the state-of-the-art polygenic score. We also show that incorporation of eQTL data improves the performance of identified PoCos in risk assessment. We also assess the biological relevance of PoCos for three diseases that have similar biological mechanisms and identify novel candidate genes. The resulting software is publicly available at http://compbio.case.edu/pocos/.
Several studies try to predict individual disease risk using genetic data obtained from genome-wide association studies (GWAS). Earlier studies focus only on individual genetic variants. However, studies on disease mechanisms suggest that the aggregation of genomic variants may contribute to diseases. For this reason, researchers commonly use prior biological knowledge to identify genetic variants that are functionally related. These approaches are often limited to variants in the coding regions of genes, even though several risk variants lie in regulatory regions. Here, we incorporate known regulatory and functional interactions to find sets of genetic variants that are informative features for risk assessment. Our results on seven complex diseases show that our method outperforms individual variant-based risk assessment models, as well as other methods that integrate multiple genetic variants.
Affiliation(s)
- Marzieh Ayati
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, Ohio, United States of America
- Mehmet Koyutürk
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Center of Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, Ohio, United States of America