151
|
Chen B, Li M, Wang J, Shang X, Wu FX. A fast and high performance multiple data integration algorithm for identifying human disease genes. BMC Med Genomics 2015; 8 Suppl 3:S2. [PMID: 26399620 PMCID: PMC4582601 DOI: 10.1186/1755-8794-8-s3-s2] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Integrating multiple data sources is indispensable for improving disease gene identification, not only because disease genes associated with similar genetic diseases tend to lie close to each other in various biological networks, but also because gene-disease associations are complex. Although various algorithms have been proposed to identify disease genes, their prediction performance and computational time still need improvement. RESULTS In this study, we propose a fast and high performance multiple data integration algorithm for identifying human disease genes. The posterior probability that each candidate gene is associated with an individual disease is calculated using a Bayesian analysis method and a binary logistic regression model. Two prior probability estimation strategies and two feature vector construction methods are developed to test the performance of the proposed algorithm. CONCLUSIONS The proposed algorithm not only generates predictions with high AUC scores, but also runs very fast. When only a single PPI network is employed, the AUC score is 0.769 using F2 as feature vectors, and the average running time for each leave-one-out experiment is only around 1.5 seconds. When three biological networks are integrated, the AUC score using F3 as feature vectors increases to 0.830, and the average running time for each leave-one-out experiment is only about 12.54 seconds. This is better than many existing algorithms.
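The AUC scores quoted in this abstract can be read as the probability that a randomly chosen known disease gene is scored above a randomly chosen non-disease gene. The following is a minimal sketch of that rank-based computation, not the authors' evaluation code; the function name `auc_score` and the toy scores are assumptions made for illustration only.

```python
def auc_score(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: predicted scores, higher means more likely to be a disease gene
    labels: 1 for known disease genes, 0 for all other candidates
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up posterior probabilities for five candidate genes (illustrative only).
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auc_score(toy_scores, toy_labels)
```

The quadratic pair loop is chosen for readability; a sort-based rank sum gives the same value on larger candidate sets.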
Affiliation(s)
- Bolin Chen
  - School of Computer Science, Northwestern Polytechnical University, 127 Youyi West Road, 710072, Xi'an, P.R. China
- Min Li
  - School of Information Science and Engineering, Central South University, 410083, Changsha, P.R. China
- Jianxin Wang
  - School of Information Science and Engineering, Central South University, 410083, Changsha, P.R. China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, 127 Youyi West Road, 710072, Xi'an, P.R. China
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Dr., S7N 5A9, Saskatoon, Canada
  - Department of Mechanical Engineering, University of Saskatchewan, 57 Campus Dr., S7N 5A9, Saskatoon, Canada
|
152
|
ProSim: A Method for Prioritizing Disease Genes Based on Protein Proximity and Disease Similarity. BIOMED RESEARCH INTERNATIONAL 2015; 2015:213750. [PMID: 26339594 PMCID: PMC4538409 DOI: 10.1155/2015/213750] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/15/2014] [Accepted: 01/16/2015] [Indexed: 01/19/2023]
Abstract
Predicting disease genes for a particular genetic disease is very challenging in bioinformatics. Based on current research studies, this challenge can be tackled via network-based approaches. Furthermore, it has been highlighted that it is necessary to consider disease similarity along with the protein's proximity to disease genes in a protein-protein interaction (PPI) network in order to improve the accuracy of disease gene prioritization. In this study we propose a new algorithm called proximity disease similarity algorithm (ProSim), which takes both of the aforementioned properties into consideration, to prioritize disease genes. To illustrate the proposed algorithm, we have conducted six case studies, namely, prostate cancer, Alzheimer's disease, diabetes mellitus type 2, breast cancer, colorectal cancer, and lung cancer. We employed leave-one-out cross validation, mean enrichment, tenfold cross validation, and ROC curves to evaluate our proposed method and other existing methods. The results show that our proposed method outperforms existing methods such as PRINCE, RWR, and DADA.
|
153
|
Ullah MZ, Aono M, Seddiqui MH. Estimating a ranked list of human hereditary diseases for clinical phenotypes by using weighted bipartite network. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2015; 2013:3475-8. [PMID: 24110477 DOI: 10.1109/embc.2013.6610290] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Abstract
With the availability of large amounts of medical knowledge data on the Internet, such as the human disease network, the protein-protein interaction (PPI) network, and the phenotype-gene and gene-disease bipartite networks, it becomes practical to help doctors by suggesting plausible hereditary diseases for a set of clinical phenotypes. However, identifying candidate diseases that best explain a set of clinical phenotypes by considering various heterogeneous networks is still a challenging task. In this paper, we propose a new method for estimating a ranked list of plausible diseases by associating the phenotype-gene bipartite network with the gene-disease bipartite network. Our approach is to count the frequency of all the paths from a phenotype to a disease through their associated causative genes, and to link the phenotype to the disease with that path frequency in a new phenotype-disease bipartite (PDB) network. After that, we generate the candidate weights for the edges between phenotypes and diseases in the PDB network. We evaluate our proposed method in terms of Normalized Discounted Cumulative Gain (NDCG), and demonstrate that it outperforms the previously known disease ranking method called Phenomizer.
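The path-counting step described in this abstract can be sketched as follows. This is an illustrative reading, not the authors' implementation: the path frequency between a phenotype and a disease is taken to be the number of genes linked to both, and the ranking simply sums raw frequencies over the query phenotypes, whereas the paper derives edge weights that the abstract does not specify. All identifiers and the toy data are hypothetical.

```python
from collections import Counter


def path_frequencies(phenotype_gene, gene_disease):
    """Count phenotype -> gene -> disease paths for every phenotype.

    phenotype_gene: dict mapping phenotype -> set of associated genes
    gene_disease:   dict mapping gene -> set of associated diseases
    Returns a dict mapping phenotype -> Counter of disease -> path count.
    """
    freq = {}
    for phenotype, genes in phenotype_gene.items():
        counts = Counter()
        for gene in genes:
            for disease in gene_disease.get(gene, ()):
                counts[disease] += 1  # one path through this gene
        freq[phenotype] = counts
    return freq


def rank_diseases(query_phenotypes, freq):
    """Rank diseases by summed path frequency (assumed scoring, see lead-in)."""
    totals = Counter()
    for phenotype in query_phenotypes:
        totals.update(freq.get(phenotype, {}))
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy bipartite networks (made-up identifiers).
toy_phenotype_gene = {"seizures": {"G1", "G2"}, "ataxia": {"G2", "G3"}}
toy_gene_disease = {"G1": {"D1"}, "G2": {"D1", "D2"}, "G3": {"D2"}}

toy_freq = path_frequencies(toy_phenotype_gene, toy_gene_disease)
toy_ranking = rank_diseases(["seizures"], toy_freq)
```

Equivalently, the path-frequency table is the product of the phenotype-gene and gene-disease biadjacency matrices; the dictionary form avoids materializing sparse matrices.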
|
154
|
Wang Z, Clark NR, Ma'ayan A. Dynamics of the discovery process of protein-protein interactions from low content studies. BMC SYSTEMS BIOLOGY 2015; 9:26. [PMID: 26048415 PMCID: PMC4456804 DOI: 10.1186/s12918-015-0173-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 01/15/2015] [Accepted: 05/29/2015] [Indexed: 12/20/2022]
Abstract
Background Thousands of biological and biomedical investigators study the functional role of single genes and their protein products in normal physiology and in disease. The findings from these studies are reported in research articles that stimulate new research. It is now established that a complex regulatory network controls human cellular fate, and this community of researchers is continually unraveling this network's topology. Attempts to integrate results from such accumulated knowledge resulted in literature-based protein-protein interaction networks (PPINs) and pathway databases. These databases are widely used by the community to analyze new data collected from emerging genome-wide studies, with the assumption that the data within these literature-based databases are the ground truth and contain no biases. While suspicion of research focus biases is growing, concrete proof of them is still missing. It is difficult to prove because the real PPINs are mostly unknown. Results Here we analyzed the longitudinal discovery process of literature-based mammalian and yeast PPINs to observe that these networks are discovered non-uniformly. The pattern of discovery is related to a theoretical concept proposed by Kauffman called “expanding the adjacent possible”. We introduce a network discovery model which explicitly includes the space of possibilities in the form of a true underlying PPIN. Conclusions Our model strongly suggests that research focus biases exist in the observed discovery dynamics of these networks. In summary, more care should be taken when using PPIN databases for analysis of newly acquired data, and when considering prior knowledge when designing new experiments.
Affiliation(s)
- Zichen Wang
  - Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY, 10029, USA
  - BD2K-LINCS Data Coordination and Integration Center, New York, USA
  - Knowledge Management Center for the Illuminating the Druggable Genome project, New York, USA
- Neil R Clark
  - Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY, 10029, USA
  - BD2K-LINCS Data Coordination and Integration Center, New York, USA
  - Knowledge Management Center for the Illuminating the Druggable Genome project, New York, USA
- Avi Ma'ayan
  - Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY, 10029, USA
  - BD2K-LINCS Data Coordination and Integration Center, New York, USA
  - Knowledge Management Center for the Illuminating the Druggable Genome project, New York, USA
|
155
|
Luo XJ, Huang L, van den Oord EJ, Aberg KA, Gan L, Zhao Z, Yao YG. Common variants in the MKL1 gene confer risk of schizophrenia. Schizophr Bull 2015; 41:715-27. [PMID: 25380769 PMCID: PMC4393692 DOI: 10.1093/schbul/sbu156] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 12/22/2022]
Abstract
Genome-wide association studies (GWAS) of schizophrenia have identified multiple risk variants with robust association signals for schizophrenia. However, these variants could explain only a small proportion of schizophrenia heritability. Furthermore, the effect size of these risk variants is relatively small (eg, most of them had an OR less than 1.2), suggesting that additional risk variants may be detected when increasing sample size in analysis. Here, we report the identification of a genome-wide significant schizophrenia risk locus at 22q13.1 by combining 2 large-scale schizophrenia cohort studies. Our meta-analysis revealed that 7 single nucleotide polymorphisms (SNPs) on chromosome 22q13.1 reached the genome-wide significance level (P < 5.0×10⁻⁸) in the combined samples (a total of 38441 individuals). Among them, SNP rs6001946 had the most significant association with schizophrenia (P = 2.04×10⁻⁸). Interestingly, all 7 SNPs are in high linkage disequilibrium and located in the MKL1 gene. Expression analysis showed that MKL1 is highly expressed in human and mouse brains. We further investigated functional links between MKL1 and proteins encoded by other schizophrenia susceptibility genes in the whole human protein interaction network. We found that MKL1 physically interacts with GSK3B, a protein encoded by a well-characterized schizophrenia susceptibility gene. Collectively, our results revealed that genetic variants in MKL1 might confer risk of schizophrenia. Further investigation of the roles of MKL1 in the pathogenesis of schizophrenia is warranted.
Affiliation(s)
- Xiong-jian Luo
  - Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences & Yunnan Province, Kunming Institute of Zoology, Kunming, Yunnan, China
  - *To whom correspondence should be addressed; Key Laboratory of Animal Models and Human Disease Mechanisms, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China; tel: 86-871-65180085, fax: 86-871-65180085
- Liang Huang
  - First Affiliated Hospital of Gannan Medical University, Ganzhou, Jiangxi 341000, China
- Edwin J. van den Oord
  - Center for Biomarker Research and Personalized Medicine, Virginia Commonwealth University, Richmond, VA 23298, USA
- Karolina A. Aberg
  - Center for Biomarker Research and Personalized Medicine, Virginia Commonwealth University, Richmond, VA 23298, USA
- Lin Gan
  - Flaum Eye Institute and Department of Ophthalmology, University of Rochester, Rochester, NY 14642, USA
- Zhongming Zhao
  - Departments of Biomedical Informatics and Psychiatry, Vanderbilt University School of Medicine, Nashville, TN
- Yong-Gang Yao
  - Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences & Yunnan Province, Kunming Institute of Zoology, Kunming, Yunnan, China
|
156
|
Ghiassian SD, Menche J, Barabási AL. A DIseAse MOdule Detection (DIAMOnD) algorithm derived from a systematic analysis of connectivity patterns of disease proteins in the human interactome. PLoS Comput Biol 2015; 11:e1004120. [PMID: 25853560 PMCID: PMC4390154 DOI: 10.1371/journal.pcbi.1004120] [Citation(s) in RCA: 229] [Impact Index Per Article: 25.4] [Received: 08/25/2014] [Accepted: 01/09/2015] [Indexed: 01/08/2023] Open
Abstract
The observation that disease associated proteins often interact with each other has fueled the development of network-based approaches to elucidate the molecular mechanisms of human disease. Such approaches build on the assumption that protein interaction networks can be viewed as maps in which diseases can be identified with localized perturbation within a certain neighborhood. The identification of these neighborhoods, or disease modules, is therefore a prerequisite of a detailed investigation of a particular pathophenotype. While numerous heuristic methods exist that successfully pinpoint disease associated modules, the basic underlying connectivity patterns remain largely unexplored. In this work we aim to fill this gap by analyzing the network properties of a comprehensive corpus of 70 complex diseases. We find that disease associated proteins do not reside within locally dense communities and instead identify connectivity significance as the most predictive quantity. This quantity inspires the design of a novel Disease Module Detection (DIAMOnD) algorithm to identify the full disease module around a set of known disease proteins. We study the performance of the algorithm using well-controlled synthetic data and systematically validate the identified neighborhoods for a large corpus of diseases. Diseases are rarely the result of an abnormality in a single gene, but involve a whole cascade of interactions between several cellular processes. To disentangle these complex interactions it is necessary to study genotype-phenotype relationships in the context of protein-protein interaction networks. Our analysis of 70 diseases shows that disease proteins are not randomly scattered within these networks, but agglomerate in specific regions, suggesting the existence of specific disease modules for each disease. The identification of these modules is the first step towards elucidating the biological mechanisms of a disease or for a targeted search of drug targets. 
We present a systematic analysis of the connectivity patterns of disease proteins and determine the most predictive topological property for their identification. This allows us to rationally design a reliable and efficient Disease Module Detection algorithm (DIAMOnD).
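The abstract names connectivity significance as the most predictive quantity but does not define it. One way to make the idea concrete is a hypergeometric tail probability: for a node with a given degree, how surprising is it that at least that many of its links reach known disease proteins if the seeds were placed at random? The sketch below follows that reading and adds a greedy expansion loop; the function names, the tie-breaking rule, and the toy network are assumptions, and the code is not the published DIAMOnD implementation.

```python
from math import comb


def connectivity_pvalue(n_nodes, n_seeds, degree, seed_links):
    """P(X >= seed_links) for X ~ Hypergeometric(n_nodes, n_seeds, degree)."""
    total = comb(n_nodes, degree)
    upper = min(degree, n_seeds)
    tail = sum(
        comb(n_seeds, i) * comb(n_nodes - n_seeds, degree - i)
        for i in range(seed_links, upper + 1)
    )
    return tail / total


def diamond_like_expand(adjacency, seeds, n_add):
    """Greedily add the non-seed node with the lowest connectivity p-value.

    adjacency: dict mapping node -> set of neighbouring nodes
    seeds:     iterable of known disease proteins
    n_add:     number of nodes to add to the module
    """
    module = set(seeds)
    added = []
    n_nodes = len(adjacency)
    for _ in range(n_add):
        best = None
        for node, neighbours in adjacency.items():
            if node in module:
                continue
            links_to_module = len(neighbours & module)
            p = connectivity_pvalue(
                n_nodes, len(module), len(neighbours), links_to_module
            )
            # Ties are broken by node name (an arbitrary, deterministic choice).
            if best is None or (p, node) < best:
                best = (p, node)
        if best is None:
            break  # every node is already in the module
        module.add(best[1])
        added.append(best[1])
    return added


# Toy interactome (made up): C touches both seeds, D touches only one.
toy_adjacency = {
    "A": {"C", "D"},
    "B": {"C"},
    "C": {"A", "B"},
    "D": {"A", "E", "F"},
    "E": {"D", "F"},
    "F": {"D", "E"},
}
toy_added = diamond_like_expand(toy_adjacency, {"A", "B"}, 2)
```

In the toy network, C is added before D even though D has more links overall, because both of C's links reach seeds. That ordering illustrates why significance, rather than raw link count or local density, drives the selection.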
Affiliation(s)
- Susan Dina Ghiassian
  - Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, Massachusetts, United States of America
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- Jörg Menche
  - Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, Massachusetts, United States of America
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
  - Center for Network Science, Central European University, Budapest, Hungary
- Albert-László Barabási
  - Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, Massachusetts, United States of America
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
  - Center for Network Science, Central European University, Budapest, Hungary
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
|
157
|
So J, Pasculescu A, Dai AY, Williton K, James A, Nguyen V, Creixell P, Schoof EM, Sinclair J, Barrios-Rodiles M, Gu J, Krizus A, Williams R, Olhovsky M, Dennis JW, Wrana JL, Linding R, Jorgensen C, Pawson T, Colwill K. Integrative analysis of kinase networks in TRAIL-induced apoptosis provides a source of potential targets for combination therapy. Sci Signal 2015; 8:rs3. [PMID: 25852190 DOI: 10.1126/scisignal.2005700] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 02/11/2024]
Abstract
Tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) is an endogenous secreted peptide and, in preclinical studies, preferentially induces apoptosis in tumor cells rather than in normal cells. The acquisition of resistance in cells exposed to TRAIL or its mimics limits their clinical efficacy. Because kinases are intimately involved in the regulation of apoptosis, we systematically characterized kinases involved in TRAIL signaling. Using RNA interference (RNAi) loss-of-function and cDNA overexpression screens, we identified 169 protein kinases that influenced the dynamics of TRAIL-induced apoptosis in the colon adenocarcinoma cell line DLD-1. We classified the kinases as sensitizers or resistors or modulators, depending on the effect that knockdown and overexpression had on TRAIL-induced apoptosis. Two of these kinases that were classified as resistors were PX domain-containing serine/threonine kinase (PXK) and AP2-associated kinase 1 (AAK1), which promote receptor endocytosis and may enable cells to resist TRAIL-induced apoptosis by enhancing endocytosis of the TRAIL receptors. We assembled protein interaction maps using mass spectrometry-based protein interaction analysis and quantitative phosphoproteomics. With these protein interaction maps, we modeled information flow through the networks and identified apoptosis-modifying kinases that are highly connected to regulated substrates downstream of TRAIL. The results of this analysis provide a resource of potential targets for the development of TRAIL combination therapies to selectively kill cancer cells.
Affiliation(s)
- Jonathan So
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada. Institute of Medical Science, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Adrian Pasculescu
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Anna Y Dai
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Kelly Williton
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Andrew James
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Vivian Nguyen
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Pau Creixell
  - Cellular Signal Integration Group (C-SIG), Technical University of Denmark (DTU), DK-2800 Lyngby, Denmark
- Erwin M Schoof
  - Cellular Signal Integration Group (C-SIG), Technical University of Denmark (DTU), DK-2800 Lyngby, Denmark
- John Sinclair
  - Cell Communication Team, The Institute of Cancer Research, London SW3 6JB, UK
- Miriam Barrios-Rodiles
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Jun Gu
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Aldis Krizus
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Ryan Williams
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Marina Olhovsky
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- James W Dennis
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada. Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Jeffrey L Wrana
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada. Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Rune Linding
  - Cellular Signal Integration Group (C-SIG), Technical University of Denmark (DTU), DK-2800 Lyngby, Denmark. Biotech Research and Innovation Centre (BRIC), University of Copenhagen (UCPH), DK-2200 Copenhagen, Denmark
- Claus Jorgensen
  - Cell Communication Team, The Institute of Cancer Research, London SW3 6JB, UK
- Tony Pawson
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada. Institute of Medical Science, University of Toronto, Toronto, Ontario M5S 1A8, Canada. Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Karen Colwill
  - Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
|
158
|
Li J, Wang L, Guo M, Zhang R, Dai Q, Liu X, Wang C, Teng Z, Xuan P, Zhang M. Mining disease genes using integrated protein-protein interaction and gene-gene co-regulation information. FEBS Open Bio 2015; 5:251-6. [PMID: 25870785 PMCID: PMC4392065 DOI: 10.1016/j.fob.2015.03.011] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 01/13/2015] [Revised: 03/19/2015] [Accepted: 03/24/2015] [Indexed: 01/24/2023] Open
Abstract
An eQTL-based gene–gene co-regulation network was constructed. We adopted a random walk with restart (RWR) algorithm to mine for Alzheimer-disease related genes. The integrated HPRD PPI and GGCRN network had faster convergence than using HPRD PPI alone. The integrated network also revealed new disease-related genes.
In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein–protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene–gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
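The random walk with restart (RWR) used in this study can be written as the iteration p(t+1) = (1 - r) W p(t) + r p(0), where W is the column-normalized adjacency matrix, p(0) is uniform over the known disease genes, and r is the restart probability. A minimal sketch under those standard definitions follows; the restart value, tolerance, and toy network are illustrative assumptions, and the graph is assumed to have no isolated nodes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the L1 change < tol.

    adjacency: dict mapping node -> set of neighbours (undirected, no isolated nodes)
    seeds:     set of known disease genes; p0 is uniform over them
    Returns (steady-state probabilities, number of iterations used).
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for iteration in range(1, max_iter + 1):
        # Restart term: jump back to the seed genes with probability `restart`.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node spreads its remaining mass evenly to its neighbours.
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                continue
            share = (1.0 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p, iteration


# Toy path network A - B - C with A as the only known disease gene.
toy_graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
toy_probs, toy_iterations = random_walk_with_restart(toy_graph, {"A"})
```

Candidate genes are then ranked by their steady-state probability. The returned iteration count is the kind of quantity one would compare to assess convergence speed across networks, although how the authors measured convergence is not stated in the abstract.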
Affiliation(s)
- Jin Li
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China; School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China; College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Limei Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China; School of Basic Medical Sciences, Harbin Medical University, Harbin, Heilongjiang, China
- Maozu Guo
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Ruijie Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Qiguo Dai
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Xiaoyan Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Chunyu Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Zhixia Teng
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Ping Xuan
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Mingming Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
|
159
|
Prediction of cancer proteins by integrating protein interaction, domain frequency, and domain interaction data using machine learning algorithms. BIOMED RESEARCH INTERNATIONAL 2015; 2015:312047. [PMID: 25866773 PMCID: PMC4381656 DOI: 10.1155/2015/312047] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/02/2014] [Revised: 02/25/2015] [Accepted: 03/03/2015] [Indexed: 12/23/2022]
Abstract
Many proteins are known to be associated with cancer diseases. Quite often, their precise functional role in disease pathogenesis remains unclear. A strategy to gain a better understanding of the function of these proteins is to combine different types of proteomics data. In this study, we extended Aragues's method by employing protein-protein interaction (PPI) data, domain-domain interaction (DDI) data, weighted domain frequency score (DFS), and cancer linker degree (CLD) data to predict cancer proteins. Performance was benchmarked in three kinds of experiments: (I) using individual algorithms, (II) combining algorithms, and (III) combining algorithms of the same classification type. When compared with Aragues's method, our proposed methods, that is, the machine learning algorithms and majority voting, are significantly superior in all seven performance measures. We demonstrated the accuracy of the proposed method on two independent datasets. The best algorithm can achieve a hit ratio of 89.4% and 72.8% for the lung cancer dataset and the lung cancer microarray study, respectively. It is anticipated that the current research could help in understanding disease mechanisms and diagnosis.
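The majority-voting scheme mentioned in this abstract can be sketched as follows, assuming each classifier outputs one binary label per protein (1 for cancer protein, 0 otherwise). The function name and the toy predictions are hypothetical, and the abstract does not say how ties were resolved, so the sketch assumes an odd number of classifiers.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier labels into one label per protein.

    predictions: list of label lists, one list per classifier, all equal length.
    With an odd number of binary classifiers no ties can occur.
    """
    n_items = len(predictions[0])
    if any(len(labels) != n_items for labels in predictions):
        raise ValueError("all classifiers must label the same proteins")
    voted = []
    for i in range(n_items):
        votes = Counter(labels[i] for labels in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Three hypothetical classifiers labelling four proteins.
toy_predictions = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
toy_consensus = majority_vote(toy_predictions)
```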
|
160
|
Schuette S, Piatkowski B, Corley A, Lang D, Geisler M. Predicted protein-protein interactions in the moss Physcomitrella patens: a new bioinformatic resource. BMC Bioinformatics 2015; 16:89. [PMID: 25885037 PMCID: PMC4384322 DOI: 10.1186/s12859-015-0524-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/20/2014] [Accepted: 03/02/2015] [Indexed: 12/11/2022] Open
Abstract
Background Physcomitrella patens, a haploid dominant plant, is fast becoming a useful molecular genetics and bioinformatics tool due to its key phylogenetic position as a bryophyte in the post-genomic era. Genome sequences from select reference species were compared bioinformatically to Physcomitrella patens using reciprocal BLAST searches with the InParanoid software package. A reference protein interaction database, assembled using MySQL by compiling the BioGRID, BIND, DIP, and IntAct databases, was queried for moss orthologs existing for both interacting partners. This method has been used to successfully predict interactions for a number of angiosperm plants. Results The first predicted protein-protein interactome for a bryophyte based on the interolog method contains 67,740 unique interactions from 5,695 different Physcomitrella patens proteins. The most conserved interactions among proteins were those associated with metabolic processes. Over-represented Gene Ontology categories are reported here. Conclusion The addition of moss, a plant representative that diverged from angiosperms 200 million years ago, to interactomic research greatly expands the possibility of conducting comparative analyses, giving tremendous insight into the network evolution of land plants. This work helps demonstrate the utility of “guilt-by-association” models for predicting protein interactions, providing provisional roadmaps that can be explored using experimental approaches. Included with this dataset is a method for characterizing subnetworks and investigating specific processes, such as the Calvin-Benson-Bassham cycle.
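The interolog method described in this abstract predicts an interaction between two proteins of the target species whenever both partners of a known reference interaction have orthologs there. A minimal sketch of that transfer rule is shown below; the ortholog table would come from a tool such as InParanoid, but the data structures, function name, and identifiers here are made up for illustration.

```python
def predict_interologs(reference_interactions, orthologs):
    """Transfer interactions from reference species to the target species.

    reference_interactions: iterable of (protein_a, protein_b) pairs
    orthologs: dict mapping a reference protein -> set of target orthologs
    Returns a set of unordered target pairs stored as sorted tuples.
    """
    predicted = set()
    for a, b in reference_interactions:
        # Both partners must have at least one ortholog in the target species.
        for target_a in orthologs.get(a, ()):
            for target_b in orthologs.get(b, ()):
                predicted.add(tuple(sorted((target_a, target_b))))
    return predicted


# Toy data: REF3 has no moss ortholog, so its interaction is not transferred.
toy_reference = [("REF1", "REF2"), ("REF2", "REF3")]
toy_orthologs = {"REF1": {"Pp1"}, "REF2": {"Pp2a", "Pp2b"}}
toy_predicted = predict_interologs(toy_reference, toy_orthologs)
```

Storing each pair as a sorted tuple makes the result a set of unique, unordered interactions, which is the kind of count the abstract reports.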
Affiliation(s)
- Scott Schuette
  - Department of Plant Biology, Southern Illinois University, Carbondale, IL, USA
- Brian Piatkowski
  - Department of Plant Biology, Southern Illinois University, Carbondale, IL, USA
- Aaron Corley
  - Department of Plant Biology, Southern Illinois University, Carbondale, IL, USA
- Daniel Lang
  - University of Freiburg, Plant Biotechnology, Schaenzlestr. 1, D-79104, Freiburg, Germany
- Matt Geisler
  - Department of Plant Biology, Southern Illinois University, Carbondale, IL, USA
|
161
|
Bertelsen B, Melchior L, Jensen LR, Groth C, Nazaryan L, Debes NM, Skov L, Xie G, Sun W, Brøndum-Nielsen K, Kuss AW, Chen W, Tümer Z. A t(3;9)(q25.1;q34.3) translocation leading to OLFM1 fusion transcripts in Gilles de la Tourette syndrome, OCD and ADHD. Psychiatry Res 2015; 225:268-75. [PMID: 25595337 DOI: 10.1016/j.psychres.2014.12.028] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 10/20/2014] [Revised: 12/08/2014] [Accepted: 12/18/2014] [Indexed: 01/13/2023]
Abstract
Gilles de la Tourette syndrome (GTS) is a neuropsychiatric disorder with a strong genetic etiology; however, the identification of candidate genes is hampered by its genetic heterogeneity and the influence of non-genetic factors on disease pathogenesis. We report a case of a male patient with GTS, obsessive-compulsive disorder, attention-deficit/hyperactivity disorder, as well as other comorbidities, and a translocation t(3;9)(q25.1;q34.3) inherited from a mother with tics. Mate-pair sequencing revealed that the translocation breakpoints truncated the olfactomedin 1 (OLFM1) gene and two uncharacterized transcripts. Reverse-transcription PCR identified several fusion transcripts in the carriers, and OLFM1 expression was found to be high in GTS-related human brain regions. As OLFM1 plays a role in neuronal development, it is a likely candidate gene for neuropsychiatric disorders, and haploinsufficiency of OLFM1 could be a contributing risk factor to the phenotype of the carriers. In addition, one of the fusion transcripts may exert a dominant-negative or gain-of-function effect. OLFM1 is unlikely to be a major GTS susceptibility gene, as no point mutations or copy number variants affecting OLFM1 were identified in 175 additional patients. The translocation described is thus a unique event, but further studies in larger cohorts are required to elucidate the involvement of OLFM1 in GTS pathogenesis.
Affiliation(s)
- Birgitte Bertelsen
- Department of Clinical Genetics, Applied Human Molecular Genetics, Kennedy Center, Copenhagen University Hospital, Rigshospitalet, Glostrup, Denmark
| | - Linea Melchior
- Department of Clinical Genetics, Applied Human Molecular Genetics, Kennedy Center, Copenhagen University Hospital, Rigshospitalet, Glostrup, Denmark
| | - Lars Riff Jensen
- Department of Human Genetics, University Medicine Greifswald and Interfaculty Institute of Genetics and Functional Genomics, University of Greifswald, Greifswald, Germany
| | - Camilla Groth
- Tourette Clinic, Department of Pediatrics, Copenhagen University Hospital, Herlev Hospital, Herlev, Denmark
| | - Lusine Nazaryan
- Department of Clinical Genetics, Applied Human Molecular Genetics, Kennedy Center, Copenhagen University Hospital, Rigshospitalet, Glostrup, Denmark
| | - Nanette Mol Debes
- Tourette Clinic, Department of Pediatrics, Copenhagen University Hospital, Herlev Hospital, Herlev, Denmark
| | - Liselotte Skov
- Tourette Clinic, Department of Pediatrics, Copenhagen University Hospital, Herlev Hospital, Herlev, Denmark
| | - Gangcai Xie
- Max Delbrück Center for Molecular Medicine, Berlin Institute for Medical Systems Biology, Berlin, Germany
| | - Wei Sun
- Max Delbrück Center for Molecular Medicine, Berlin Institute for Medical Systems Biology, Berlin, Germany
| | - Karen Brøndum-Nielsen
- Department of Clinical Genetics, Applied Human Molecular Genetics, Kennedy Center, Copenhagen University Hospital, Rigshospitalet, Glostrup, Denmark
| | - Andreas Walter Kuss
- Department of Human Genetics, University Medicine Greifswald and Interfaculty Institute of Genetics and Functional Genomics, University of Greifswald, Greifswald, Germany
| | - Wei Chen
- Max Delbrück Center for Molecular Medicine, Berlin Institute for Medical Systems Biology, Berlin, Germany
| | - Zeynep Tümer
- Department of Clinical Genetics, Applied Human Molecular Genetics, Kennedy Center, Copenhagen University Hospital, Rigshospitalet, Glostrup, Denmark.
| |
|
162
|
Menche J, Sharma A, Kitsak M, Ghiassian SD, Vidal M, Loscalzo J, Barabási AL. Disease networks. Uncovering disease-disease relationships through the incomplete interactome. Science 2015; 347:1257601. [PMID: 25700523 PMCID: PMC4435741 DOI: 10.1126/science.1257601] [Citation(s) in RCA: 891] [Impact Index Per Article: 99.0] [Indexed: 01/02/2023]
Abstract
According to the disease module hypothesis, the cellular components associated with a disease segregate in the same neighborhood of the human interactome, the map of biologically relevant molecular interactions. Yet, given the incompleteness of the interactome and the limited knowledge of disease-associated genes, it is not obvious if the available data have sufficient coverage to map out modules associated with each disease. Here we derive mathematical conditions for the identifiability of disease modules and show that the network-based location of each disease module determines its pathobiological relationship to other diseases. For example, diseases with overlapping network modules show significant coexpression patterns, symptom similarity, and comorbidity, whereas diseases residing in separated network neighborhoods are phenotypically distinct. These tools represent an interactome-based platform to predict molecular commonalities between phenotypically related diseases, even if they do not share primary disease genes.
Affiliation(s)
- Jörg Menche
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA. Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary
| | - Amitabh Sharma
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
| | - Maksim Kitsak
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
| | - Susan Dina Ghiassian
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
| | - Marc Vidal
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA. Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Joseph Loscalzo
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA
| | - Albert-László Barabási
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA. Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary. Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA.
| |
|
163
|
Wu S, Shao F, Ji J, Sun R, Dong R, Zhou Y, Xu S, Sui Y, Hu J. Network propagation with dual flow for gene prioritization. PLoS One 2015; 10:e0116505. [PMID: 25689268 PMCID: PMC4331530 DOI: 10.1371/journal.pone.0116505] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/16/2014] [Accepted: 11/24/2014] [Indexed: 12/31/2022] Open
Abstract
Based on the hypothesis that the neighbors of disease genes tend to cause similar diseases, network-based methods for disease prediction have received increasing attention. Because they take full advantage of network structure, global distance measurements generally outperform local distance measurements. However, global distance measurements have their own problems. For example, they may mistake non-disease hub proteins that have dense interactions with known disease proteins for potential disease proteins. To find a method that avoids this problem, we analyzed the differences between disease proteins and other proteins by using essential proteins (proteins encoded by essential genes) as references. We found that disease proteins are not well connected with essential proteins in protein interaction networks. Based on this finding, we propose a novel strategy for gene prioritization based on protein interaction networks. We allocated positive flow to disease genes and negative flow to essential genes, and adopted network propagation for gene prioritization. Experimental results on 110 diseases verified the effectiveness and potential of the proposed method.
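The dual-flow idea described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the propagation rule (iterating F ← αW'F + (1 − α)Y with a symmetrically normalized adjacency matrix W'), the seed values of +1 and −1, the parameter `alpha`, the helper names, and the toy gene labels are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Set


def propagate(adj: Dict[str, Set[str]], seeds: Dict[str, float],
              alpha: float = 0.5, iters: int = 200) -> Dict[str, float]:
    """Iterate F <- alpha * W' F + (1 - alpha) * Y on an undirected graph.

    W' is the degree-normalized adjacency matrix, A_ij / sqrt(d_i * d_j).
    The update rule and alpha are illustrative assumptions.
    """
    degree = {n: len(adj[n]) for n in adj}
    y = {n: float(seeds.get(n, 0.0)) for n in adj}  # initial flow per node
    f = dict(y)
    for _ in range(iters):
        f = {
            n: alpha * sum(f[m] / (degree[n] * degree[m]) ** 0.5 for m in adj[n])
            + (1 - alpha) * y[n]
            for n in adj
        }
    return f


def rank_candidates(scores: Dict[str, float], seeds: Dict[str, float]) -> List[str]:
    """Rank non-seed nodes by propagated score, highest first."""
    return sorted((n for n in scores if n not in seeds),
                  key=lambda n: scores[n], reverse=True)


# Hypothetical toy network: D1 is a known disease gene, E1 an essential gene,
# and H is a hub connected to both.
network = {
    "D1": {"C1", "H"},
    "E1": {"C2", "H"},
    "C1": {"D1"},
    "C2": {"E1"},
    "H": {"D1", "E1"},
}
seeds = {"D1": 1.0, "E1": -1.0}  # positive flow vs. negative flow

scores = propagate(network, seeds)
ranking = rank_candidates(scores, seeds)
print(ranking)
```

In this toy network the hub H receives positive flow from D1 and an equal amount of negative flow from E1, so its score stays near zero and it ranks below C1, the candidate whose only neighbor is the disease gene.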
Affiliation(s)
- Shunyao Wu
- College of Automation Engineering, Qingdao University, Qingdao, China
- College of Information Engineering, Qingdao University, Qingdao, China
| | - Fengjing Shao
- College of Automation Engineering, Qingdao University, Qingdao, China
- College of Information Engineering, Qingdao University, Qingdao, China
| | - Jun Ji
- College of Information Engineering, Qingdao University, Qingdao, China
| | - Rencheng Sun
- College of Information Engineering, Qingdao University, Qingdao, China
| | - Rizhuang Dong
- School of Computer Engineering, Qingdao Technological University, Qingdao, China
| | - Yuanke Zhou
- College of Information Engineering, Qingdao University, Qingdao, China
| | - Shaojie Xu
- College of Information Engineering, Qingdao University, Qingdao, China
| | - Yi Sui
- College of Information Engineering, Qingdao University, Qingdao, China
| | - Jianlong Hu
- College of Information Engineering, Qingdao University, Qingdao, China
| |
|
164
|
Zhao ZQ, Han GS, Yu ZG, Li J. Laplacian normalization and random walk on heterogeneous networks for disease-gene prioritization. Comput Biol Chem 2015; 57:21-8. [PMID: 25736609 DOI: 10.1016/j.compbiolchem.2015.02.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 12/31/2014] [Accepted: 02/03/2015] [Indexed: 12/11/2022]
Abstract
Random walk on heterogeneous networks is a recently emerging approach to effective disease gene prioritization. Laplacian normalization is a technique capable of normalizing the weight of edges in a network. We use this technique to normalize the gene matrix and the phenotype matrix before the construction of the heterogeneous network, and also use this idea to define the transition matrices of the heterogeneous network. Our method performs remarkably better than the existing methods at recovering known gene-phenotype relationships. The Shannon information entropy of the distribution of the transition probabilities in our networks is found to be smaller than that of the networks constructed by the existing methods, implying that a higher number of top-ranked genes can be verified as disease genes. In fact, the most probable gene-phenotype relationships ranked within the top 3 or top 5 in our gene lists can be confirmed by the OMIM database in many cases. Our algorithms have shown remarkably superior performance over the state-of-the-art algorithms for recovering gene-phenotype relationships. All MATLAB code is available upon request by email.
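For readers unfamiliar with Laplacian normalization, a minimal sketch follows. It assumes the common definition in which each entry M_ij of a symmetric, non-negative weight matrix is divided by sqrt(d_i · d_j), where d_i is the i-th row sum. The abstract does not state the formula, so this definition, the function name `laplacian_normalize`, and the toy matrix are assumptions and not the authors' code (which the abstract says is written in MATLAB).

```python
import math
from typing import List


def laplacian_normalize(m: List[List[float]]) -> List[List[float]]:
    """Return the matrix with entries m[i][j] / sqrt(d_i * d_j), d_i = sum of row i.

    Rows with zero sum (isolated nodes) are left as zeros to avoid division by zero.
    """
    n = len(m)
    d = [sum(row) for row in m]
    return [
        [m[i][j] / math.sqrt(d[i] * d[j]) if d[i] > 0 and d[j] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3-gene similarity network: gene 0 is linked to genes 1 and 2.
weights = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
normalized = laplacian_normalize(weights)
for row in normalized:
    print([round(x, 4) for x in row])
```

Edges incident to high-degree nodes are down-weighted (here each edge of gene 0 becomes 1/sqrt(2)), and the normalized matrix stays symmetric.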
Affiliation(s)
- Zhi-Qin Zhao
- Hunan Key Laboratory for Computation and Simulation in Science and Engineering and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, Hunan 411105, China
| | - Guo-Sheng Han
- Hunan Key Laboratory for Computation and Simulation in Science and Engineering and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, Hunan 411105, China
| | - Zu-Guo Yu
- Hunan Key Laboratory for Computation and Simulation in Science and Engineering and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, Hunan 411105, China; School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane Q4001, Australia.
| | - Jinyan Li
- Advanced Analytics Institute & Centre for Health Technologies, University of Technology Sydney, Broadway, NSW 2007, Australia.
| |
|
165
|
Wu L, Shen Y, Li M, Wu FX. Network output controllability-based method for drug target identification. IEEE Trans Nanobioscience 2015; 14:184-91. [PMID: 25643411 DOI: 10.1109/tnb.2015.2391175] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]
Abstract
Biomolecules do not perform their functions alone but interact with one another to form so-called biomolecular networks. It is well known that a complex disease stems from malfunctions of the corresponding biomolecular networks. Therefore, one important task is to identify drug targets from biomolecular networks. In this study, drug target identification is formulated as the problem of finding steering nodes in biomolecular networks, and the concept of network output controllability is applied to this problem. By applying control signals to these steering nodes, the biomolecular networks are expected to transition from one state to another. A graph-theoretic algorithm is proposed to find a minimum set of steering nodes in biomolecular networks, which can serve as a potential set of drug targets. Application of the method to real biomolecular networks shows that the identified potential drug targets agree with existing research results. This indicates that the method can generate testable predictions and provide insights into the experimental design of drug discovery.
|
166
|
Jin K, Musso G, Vlasblom J, Jessulat M, Deineko V, Negroni J, Mosca R, Malty R, Nguyen-Tran DH, Aoki H, Minic Z, Freywald T, Phanse S, Xiang Q, Freywald A, Aloy P, Zhang Z, Babu M. Yeast Mitochondrial Protein–Protein Interactions Reveal Diverse Complexes and Disease-Relevant Functional Relationships. J Proteome Res 2015; 14:1220-37. [DOI: 10.1021/pr501148q] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Affiliation(s)
- Ke Jin
- Terrence Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Gabriel Musso
- Cardiovascular Division, Brigham and Women’s Hospital, Boston, Massachusetts 02115, United States
- Department of Medicine, Harvard Medical School, Boston, Massachusetts 02115, United States
| | - James Vlasblom
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Matthew Jessulat
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Viktor Deineko
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Jacopo Negroni
- Joint IRB−BSC Program in Computational Biology, IRB, Barcelona 08028, Spain
| | - Roberto Mosca
- Joint IRB−BSC Program in Computational Biology, IRB, Barcelona 08028, Spain
| | - Ramy Malty
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Diem-Hang Nguyen-Tran
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Hiroyuki Aoki
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Zoran Minic
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Tanya Freywald
- Cancer Research Unit, Saskatchewan Cancer Agency, Saskatoon, Saskatchewan S7N 5E5, Canada
| | - Sadhna Phanse
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Qian Xiang
- Terrence Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Andrew Freywald
- Cancer Research Unit, Saskatchewan Cancer Agency, Saskatoon, Saskatchewan S7N 5E5, Canada
| | - Patrick Aloy
- Joint IRB−BSC Program in Computational Biology, IRB, Barcelona 08028, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona 08010, Spain
| | - Zhaolei Zhang
- Terrence Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Mohan Babu
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| |
|
167
|
Engchuan W, Dhindsa K, Lionel AC, Scherer SW, Chan JH, Merico D. Performance of case-control rare copy number variation annotation in classification of autism. BMC Med Genomics 2015; 8 Suppl 1:S7. [PMID: 25783485 PMCID: PMC4315323 DOI: 10.1186/1755-8794-8-s1-s7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND A substantial proportion of Autism Spectrum Disorder (ASD) risk resides in de novo germline and rare inherited genetic variation. In particular, rare copy number variation (CNV) contributes to ASD risk in up to 10% of ASD subjects. Despite the striking degree of genetic heterogeneity, case-control studies have detected specific burden of rare disruptive CNV for neuronal and neurodevelopmental pathways. Here, we used machine learning methods to classify ASD subjects and controls, based on rare CNV data and comprehensive gene annotations. We investigated performance of different methods and estimated the percentage of ASD subjects that could be reliably classified based on presumed etiologic CNV they carry. RESULTS We analyzed 1,892 Caucasian ASD subjects and 2,342 matched controls. Rare CNVs (frequency 1% or less) were detected using Illumina 1M and 1M-Duo BeadChips. Conditional Inference Forest (CF) typically performed as well as or better than other classification methods. We found a maximum AUC (area under the ROC curve) of 0.533 when considering all ASD subjects with rare genic CNVs, corresponding to 7.9% correctly classified ASD subjects and less than 3% incorrectly classified controls; performance was significantly higher when considering only subjects harboring de novo or pathogenic CNVs. We also found rare losses to be more predictive than gains and that curated neurally-relevant annotations (brain expression, synaptic components and neurodevelopmental phenotypes) outperform Gene Ontology and pathway-based annotations. CONCLUSIONS CF is an optimal classification approach for case-control rare CNV data and it can be used to prioritize subjects with variants potentially contributing to ASD risk not yet recognized. The neurally-relevant annotations used in this study could be successfully applied to rare CNV case-control data-sets for other neuropsychiatric disorders.
|
168
|
Sharma A, Menche J, Huang CC, Ort T, Zhou X, Kitsak M, Sahni N, Thibault D, Voung L, Guo F, Ghiassian SD, Gulbahce N, Baribaud F, Tocker J, Dobrin R, Barnathan E, Liu H, Panettieri RA, Tantisira KG, Qiu W, Raby BA, Silverman EK, Vidal M, Weiss ST, Barabási AL. A disease module in the interactome explains disease heterogeneity, drug response and captures novel pathways and genes in asthma. Hum Mol Genet 2015; 24:3005-20. [PMID: 25586491 DOI: 10.1093/hmg/ddv001] [Citation(s) in RCA: 120] [Impact Index Per Article: 13.3] [Received: 09/01/2014] [Accepted: 01/05/2015] [Indexed: 01/24/2023] Open
Abstract
Recent advances in genetics have spurred rapid progress towards the systematic identification of genes involved in complex diseases. Still, the detailed understanding of the molecular and physiological mechanisms through which these genes affect disease phenotypes remains a major challenge. Here, we identify the asthma disease module, i.e. the local neighborhood of the interactome whose perturbation is associated with asthma, and validate it for functional and pathophysiological relevance, using both computational and experimental approaches. We find that the asthma disease module is enriched with modest GWAS P-values against the background of random variation, and with differentially expressed genes from normal and asthmatic fibroblast cells treated with an asthma-specific drug. The asthma module also contains immune response mechanisms that are shared with other immune-related disease modules. Further, using diverse omics (genomics, gene-expression, drug response) data, we identify the GAB1 signaling pathway as an important novel modulator in asthma. The wiring diagram of the uncovered asthma module suggests a relatively close link between GAB1 and glucocorticoids (GCs), which we experimentally validate, observing an increase in the level of GAB1 after GC treatment in BEAS-2B bronchial epithelial cells. The siRNA knockdown of GAB1 in the BEAS-2B cell line resulted in a decrease in the NFkB level, suggesting a novel regulatory path of the pro-inflammatory factor NFkB by GAB1 in asthma.
Affiliation(s)
- Amitabh Sharma
- Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Jörg Menche
- Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Theoretical Physics, Budapest University of Technology and Economics, H1111, Budapest, Hungary; Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary
| | - C Chris Huang
- Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
| | - Tatiana Ort
- Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
| | - Xiaobo Zhou
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Maksim Kitsak
- Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Nidhi Sahni
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Derek Thibault
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Linh Voung
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Feng Guo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Susan Dina Ghiassian
- Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Natali Gulbahce
- Department of Cellular and Molecular Pharmacology, University of California 1700, 4th Street, Byers Hall 308D, San Francisco, CA 94158, USA
| | - Frédéric Baribaud
- Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
| | - Joel Tocker
- Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
| | - Radu Dobrin
- Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
| | - Elliot Barnathan
- Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
| | - Hao Liu
- Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
| | - Reynold A Panettieri
- Pulmonary Allergy and Critical Care Division, Department of Medicine, University of Pennsylvania, 125 South 31st Street, TRL Suite 1200, Philadelphia, PA 19104, USA
| | - Kelan G Tantisira
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Weiliang Qiu
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Benjamin A Raby
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Edwin K Silverman
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Marc Vidal
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Scott T Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Albert-László Barabási
- Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Theoretical Physics, Budapest University of Technology and Economics, H1111, Budapest, Hungary; Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary
| |
|
169
|
Chen H, Zhu Z, Zhu Y, Wang J, Mei Y, Cheng Y. Pathway mapping and development of disease-specific biomarkers: protein-based network biomarkers. J Cell Mol Med 2015; 19:297-314. [PMID: 25560835 PMCID: PMC4407592 DOI: 10.1111/jcmm.12447] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 06/12/2014] [Accepted: 08/22/2014] [Indexed: 01/06/2023] Open
Abstract
It is known that a disease is rarely the consequence of an abnormality in a single gene, but reflects the interactions of various processes in a complex network. Annotated molecular networks offer new opportunities to understand diseases within a systems biology framework and provide an excellent substrate for network-based identification of biomarkers. Network biomarkers and dynamic network biomarkers (DNBs) represent new types of biomarkers based on protein-protein or gene-gene interactions that can be monitored and evaluated at different stages and time-points during the development of disease. Clinical bioinformatics, as a new way to combine clinical measurements and signs with human tissue-generated bioinformatics, is crucial to translate biomarkers into clinical application, validate their disease specificity, and understand the role of biomarkers in clinical settings. In this article, recent advances and developments in network biomarkers and DNBs are comprehensively reviewed. We also discuss how network biomarkers improve the understanding of the molecular mechanisms of diseases, the advantages and constraints of network biomarkers for clinical application, clinical bioinformatics as a bridge to the development of disease-specific, stage-specific, severity-specific and therapy-predictive biomarkers, and the potential of network biomarkers.
Affiliation(s)
- Hao Chen
- Department of Cardiothoracic Surgery, Tongji Hospital, Tongji University, Shanghai, China
| |
|
170
|
Nie W, Lv Y, Yan L, Guan T, Li Q, Guo X, Liu W, Feng M, Xu G, Chen X, Lv H. Discovery and characterization of functional modules and pathogenic genes associated with the risk of coronary artery disease. RSC Adv 2015. [DOI: 10.1039/c5ra01920f] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/07/2023] Open
Abstract
An integrated network biology approach for identifying disease risk functional modules and risk pathogenic genes associated with CAD risk.
|
171
|
Qin G, Zhao XM. A survey on computational approaches to identifying disease biomarkers based on molecular networks. J Theor Biol 2014; 362:9-16. [DOI: 10.1016/j.jtbi.2014.06.007] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 01/28/2014] [Revised: 06/03/2014] [Accepted: 06/04/2014] [Indexed: 11/29/2022]
|
172
|
Gomez-Cabrero D, Menche J, Cano I, Abugessaisa I, Huertas-Migueláñez M, Tenyi A, Marin de Mas I, Kiani NA, Marabita F, Falciani F, Burrowes K, Maier D, Wagner P, Selivanov V, Cascante M, Roca J, Barabási AL, Tegnér J. Systems Medicine: from molecular features and models to the clinic in COPD. J Transl Med 2014; 12 Suppl 2:S4. [PMID: 25471042 PMCID: PMC4255907 DOI: 10.1186/1479-5876-12-s2-s4] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND AND HYPOTHESIS Chronic Obstructive Pulmonary Disease (COPD) patients are characterized by heterogeneous clinical manifestations and patterns of disease progression. Two major factors that can be used to identify COPD subtypes are muscle dysfunction/wasting and co-morbidity patterns. We hypothesized that COPD heterogeneity is in part the result of complex interactions between several genes and pathways. We explored the possibility of using a Systems Medicine approach to identify such pathways, as well as to generate predictive computational models that may be used in clinical practice. OBJECTIVE AND METHOD Our overarching goal is to generate clinically applicable predictive models that characterize COPD heterogeneity through a Systems Medicine approach. To this end we have developed a general framework, consisting of three steps/objectives: (1) feature identification, (2) model generation and statistical validation, and (3) application and validation of the predictive models in the clinical scenario. We used muscle dysfunction and co-morbidity as test cases for this framework. RESULTS In the study of muscle wasting we identified relevant features (genes) by a network analysis and generated predictive models that integrate mechanistic and probabilistic models. This allowed us to characterize muscle wasting as a general de-regulation of pathway interactions. In the co-morbidity analysis we identified relevant features (genes/pathways) by the integration of gene-disease and disease-disease associations. We further present a detailed characterization of co-morbidities in COPD patients that was implemented into a predictive model. In both use cases we were able to achieve predictive modeling, but we also identified several key challenges, the most pressing being the validation and implementation into actual clinical practice. CONCLUSIONS The results confirm the potential of the Systems Medicine approach to study complex diseases and generate clinically relevant predictive models. Our study also highlights important obstacles and bottlenecks for such approaches (e.g. data availability and normalization of frameworks among others) and suggests specific proposals to overcome them.
|
173
|
Luo X, Huang L, Han L, Luo Z, Hu F, Tieu R, Gan L. Systematic prioritization and integrative analysis of copy number variations in schizophrenia reveal key schizophrenia susceptibility genes. Schizophr Bull 2014; 40:1285-99. [PMID: 24664977 PMCID: PMC4193716 DOI: 10.1093/schbul/sbu045] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 01/19/2023]
Abstract
Schizophrenia is a common mental disorder with high heritability and strong genetic heterogeneity. The common disease-common variant hypothesis predicts that schizophrenia is attributable in part to common genetic variants. However, recent studies have clearly demonstrated that copy number variations (CNVs) also play pivotal roles in schizophrenia susceptibility and explain a proportion of the missing heritability. Though numerous CNVs have been identified, many of the regions affected by CNVs show poor overlap among different studies, and it is not known whether the genes disrupted by CNVs contribute to the risk of schizophrenia. By using cumulative scoring, we systematically prioritized the genes affected by CNVs in schizophrenia. We identified 8 top genes that are frequently disrupted by CNVs, including NRXN1, CHRNA7, BCL9, CYFIP1, GJA8, NDE1, SNAP29, and GJA5. Integration of genes affected by CNVs with known schizophrenia susceptibility genes (from previous genetic linkage and association studies) reveals that many genes disrupted by CNVs are also associated with schizophrenia. Further protein-protein interaction (PPI) analysis indicates that protein products of genes affected by CNVs frequently interact with known schizophrenia-associated proteins. Finally, systematic integration of CNV prioritization data with genetic association and PPI data identifies key schizophrenia candidate genes. Our results provide a global overview of genes impacted by CNVs in schizophrenia and reveal a densely interconnected molecular network of de novo CNVs in schizophrenia. Though the prioritized top genes represent promising schizophrenia risk genes, further work with different prioritization methods and independent samples is needed to confirm these findings. Nevertheless, the identified key candidate genes may have important roles in the pathogenesis of schizophrenia, and further functional characterization of these genes may provide pivotal targets for future therapeutics and diagnostics.
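The cumulative scoring idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each study contributes one point to every gene its CNVs disrupt and that genes are ranked by their total; the abstract does not specify the actual scoring scheme, and the function names, study lists, and gene labels below are hypothetical placeholders.

```python
from collections import Counter

def cumulative_scores(studies):
    """Count, for each gene, how many studies report it as disrupted by a CNV.

    Assumption: one point per study per gene (the paper's real weights
    are not given in the abstract).
    """
    scores = Counter()
    for genes in studies:
        scores.update(set(genes))  # set() so a gene counts once per study
    return scores

def prioritize(studies, top_n):
    """Return the top_n (gene, score) pairs; ties are broken alphabetically."""
    scores = cumulative_scores(studies)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]

# Hypothetical toy input: three studies with placeholder gene labels.
toy_studies = [
    ["GENE_A", "GENE_B"],
    ["GENE_A", "GENE_C"],
    ["GENE_A", "GENE_B", "GENE_D"],
]
top_genes = prioritize(toy_studies, top_n=2)
```

Counting each gene once per study keeps a single large study from dominating the ranking, which is one plausible reading of "frequently disrupted".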
Affiliation(s)
- Xiongjian Luo
- Flaum Eye Institute and Department of Ophthalmology, University of Rochester, Rochester, NY; College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, Zhejiang, China; State Key Laboratory of Genetic Engineering and Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
- Liang Huang
- First Affiliated Hospital of Gannan Medical University, Ganzhou, Jiangxi, China; Affiliated Eye Hospital of Nanchang University, Nanchang, Jiangxi, China (these authors contributed equally to the article)
- Leng Han
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX (these authors contributed equally to the article)
- Zhenwu Luo
- Wuhan Institute of Virology, Chinese Academy of Sciences, WuChang, Wuhan, China
- Fang Hu
- First Affiliated Hospital of Gannan Medical University, Ganzhou, Jiangxi, China; Affiliated Eye Hospital of Nanchang University, Nanchang, Jiangxi, China
- Roger Tieu
- Department of Biochemistry, Emory University, Atlanta, GA
- Lin Gan
- Flaum Eye Institute and Department of Ophthalmology, University of Rochester, Rochester, NY; College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, Zhejiang, China
|
174
|
Xu W, Jiang X, Hu X, Li G. Visualization of genetic disease-phenotype similarities by multiple maps t-SNE with Laplacian regularization. BMC Med Genomics 2014; 7 Suppl 2:S1. [PMID: 25350393 PMCID: PMC4243097 DOI: 10.1186/1755-8794-7-s2-s1] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND From a phenotypic standpoint, certain types of diseases may prove difficult to diagnose accurately because of specific combinations of confounding symptoms. Referred to as phenotypic overlap, these sets of disease-related symptoms suggest shared pathophysiological mechanisms. Few attempts have been made to visualize the phenotypic relationships between different human diseases from a machine learning perspective. We anticipate that the proposed research will visually assist researchers in quickly disambiguating symptoms that can confound the timely and accurate diagnosis of a disease. METHODS Our method is primarily based on multiple maps t-SNE (mm-tSNE), which is a probabilistic method for visualizing data points in multiple low-dimensional spaces. We improved mm-tSNE by adding a Laplacian regularization term and subsequently provide an algorithm for optimizing the new objective function. The advantage of Laplacian regularization is that it adopts clustering structures of variables and provides more sparsity to the estimated parameters. RESULTS To further assess our modified mm-tSNE algorithm from a comparative standpoint, we reexamined two social network datasets used by the previous authors. Subsequently, we applied our method to a phenotype dataset. In all these cases, our proposed method demonstrated better performance than the original version of mm-tSNE, as measured by the neighbourhood preservation ratio. CONCLUSIONS Phenotype grouping reflects the nature of human disease genetics. Thus, phenotype visualization may complement the investigation of candidate genes for diseases as well as functional relations between genes and proteins. These relationships can be modelled by the modified mm-tSNE method. The modified mm-tSNE can be applied directly in other domains, including social and biological datasets.
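The neighbourhood preservation ratio used as the evaluation measure above can be sketched for the simple single-map case. This is an assumption-laden illustration, not the authors' code: it takes the ratio to be the average fraction of each point's k nearest neighbours in the original space that remain among its k nearest neighbours in the embedding, and it ignores the per-map importance weights that a multiple-maps variant would need. All names and data points are hypothetical.

```python
import math

def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append({j for _, j in dists[:k]})
    return neighbours

def neighbourhood_preservation_ratio(high, low, k):
    """Average share of high-dimensional k-NN that survive in the low-dimensional map."""
    high_nn = knn_indices(high, k)
    low_nn = knn_indices(low, k)
    kept = sum(len(a & b) for a, b in zip(high_nn, low_nn))
    return kept / (k * len(high))

# Hypothetical toy data: two tight pairs of points in 3-D.
high_dim = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0)]
good_map = [(0,), (1,), (10,), (11,)]   # keeps every nearest neighbour
bad_map = [(0,), (10,), (1,), (11,)]    # swaps the pairs

npr_good = neighbourhood_preservation_ratio(high_dim, good_map, k=1)
npr_bad = neighbourhood_preservation_ratio(high_dim, bad_map, k=1)
```

A value near 1 means local structure is preserved; the brute-force neighbour search is quadratic and only suitable for small examples.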
|
175
|
Li M, Zhang J, Liu Q, Wang J, Wu FX. Prediction of disease-related genes based on weighted tissue-specific networks by using DNA methylation. BMC Med Genomics 2014; 7 Suppl 2:S4. [PMID: 25350763 PMCID: PMC4243158 DOI: 10.1186/1755-8794-7-s2-s4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Predicting disease-related genes is one of the most important tasks in bioinformatics and systems biology. With the advances in high-throughput techniques, a large number of protein-protein interactions are available, which makes it possible to identify disease-related genes at the network level. However, network-based identification of disease-related genes is still a challenge because considerable false positives still exist in the currently available protein interaction networks (PIN). RESULTS Considering the fact that the majority of genetic disorders tend to manifest only in a single or a few tissues, we constructed tissue-specific networks (TSN) by integrating PIN and tissue-specific data. We further weighted the constructed tissue-specific network (WTSN) by using DNA methylation, as it plays an irreplaceable role in the development of complex diseases. A PageRank-based method was developed to identify disease-related genes from the constructed networks. To validate the effectiveness of the proposed method, we constructed PIN, weighted PIN (WPIN), TSN, and WTSN for colon cancer and leukemia, respectively. The experimental results on colon cancer and leukemia show that the combination of tissue-specific data and DNA methylation can help to identify disease-related genes more accurately. Moreover, the PageRank-based method was effective in predicting disease-related genes in the case studies of colon cancer and leukemia. CONCLUSIONS Tissue-specific data and DNA methylation are two important factors in the study of human diseases. The same method implemented on the WTSN achieves better results than when it is implemented on the original PIN, WPIN, or TSN. The PageRank-based method outperforms the degree centrality-based method for identifying disease-related genes from WTSN.
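A PageRank-style ranking on a weighted network, as described in the abstract above, might look like the following minimal sketch. It is a generic weighted PageRank by power iteration, not the authors' implementation: the damping factor of 0.85, the undirected treatment of edges, the requirement of positive weights, and the toy node labels and weights (standing in for methylation-derived weights) are all assumptions made for illustration.

```python
def weighted_pagerank(edges, damping=0.85, tol=1e-10, max_iter=1000):
    """Rank nodes of an undirected weighted network by power iteration.

    edges: dict mapping (u, v) -> positive weight (hypothetical weights here).
    """
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    nodes = sorted(adj)
    n = len(nodes)
    strength = {u: sum(adj[u].values()) for u in nodes}  # total edge weight per node
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            for v, w in adj[u].items():
                # u passes its rank to neighbours in proportion to edge weight
                new[v] += damping * rank[u] * w / strength[u]
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank

# Hypothetical toy network: hub "A" with a heavier edge to "D".
toy_edges = {("A", "B"): 1.0, ("A", "C"): 1.0, ("A", "D"): 2.0}
ranks = weighted_pagerank(toy_edges)
```

In this toy network the hub ranks first and the neighbour on the heavier edge ranks above the other two, which is the behaviour edge weighting is meant to produce.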
Affiliation(s)
- Min Li
- School of Information Science and Engineering, Central South University, Changsha 410083, Hunan, P. R. China
- Jiayi Zhang
- School of Information Science and Engineering, Central South University, Changsha 410083, Hunan, P. R. China
- Qing Liu
- School of Information Science and Engineering, Central South University, Changsha 410083, Hunan, P. R. China
- Jianxin Wang
- School of Information Science and Engineering, Central South University, Changsha 410083, Hunan, P. R. China
- Fang-Xiang Wu
- School of Information Science and Engineering, Central South University, Changsha 410083, Hunan, P. R. China
- College of Engineering, University of Saskatchewan, 57 Campus Dr., Saskatoon, SK Canada
|
176
|
Chen B, Wang J, Li M, Wu FX. Identifying disease genes by integrating multiple data sources. BMC Med Genomics 2014; 7 Suppl 2:S2. [PMID: 25350511 PMCID: PMC4243092 DOI: 10.1186/1755-8794-7-s2-s2] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Multiple types of data are now available for identifying disease genes. Those data include gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, gene expression profiles, etc. It is believed that integrating different kinds of biological data is an effective method to identify disease genes. RESULTS In this paper, we propose a multiple data integration method based on the theory of Markov random field (MRF) and the method of Bayesian analysis for identifying human disease genes. The proposed method is not only flexible in easily incorporating different kinds of data, but also reliable in predicting candidate disease genes. CONCLUSIONS Numerical experiments are carried out by integrating known gene-disease associations, protein complexes, protein-protein interactions, pathways and gene expression profiles. Predictions are evaluated by the leave-one-out method. The proposed method achieves an AUC score of 0.743 when integrating all those biological data in our experiments.
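The AUC figure reported above summarises how well predicted scores separate true disease genes from the rest. As a minimal sketch (not the authors' evaluation code), AUC can be computed as the probability that a randomly chosen positive outranks a randomly chosen negative, with ties counted as one half. The labels and scores below are hypothetical toy values, and the function name is an assumption.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for a known disease gene, 0 otherwise (at least one of each).
    scores: predicted scores, higher means more likely disease-related.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical toy predictions, e.g. collected over leave-one-out runs.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.1, 0.4, 0.5]
toy_auc = auc_score(toy_labels, toy_scores)
```

Here three of the four positive-negative pairs are ordered correctly, so the toy AUC is 0.75; a value of 0.5 would correspond to random ranking.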
|
177
|
Guo NL, Wan YW. Network-based identification of biomarkers coexpressed with multiple pathways. Cancer Inform 2014; 13:37-47. [PMID: 25392692 PMCID: PMC4218687 DOI: 10.4137/cin.s14054] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 04/23/2014] [Revised: 06/25/2014] [Accepted: 06/29/2014] [Indexed: 02/07/2023] Open
Abstract
Unraveling complex molecular interactions and networks and incorporating clinical information in modeling will present a paradigm shift in molecular medicine. Embedding biological relevance via modeling molecular networks and pathways has become increasingly important for biomarker identification in cancer susceptibility and metastasis studies. Here, we give a comprehensive overview of computational methods used for biomarker identification, and provide a performance comparison of several network models used in studies of cancer susceptibility, disease progression, and prognostication. Specifically, we evaluated implication networks, Boolean networks, Bayesian networks, and Pearson’s correlation networks in constructing gene coexpression networks for identifying lung cancer diagnostic and prognostic biomarkers. The results show that implication networks, implemented in the Genet package, identified sets of biomarkers that generated an accurate prediction of lung cancer risk and metastases; meanwhile, implication networks revealed more biologically relevant molecular interactions than Boolean networks, Bayesian networks, and Pearson’s correlation networks when evaluated with the MSigDB database.
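Of the network models compared above, the Pearson's correlation network is the simplest to illustrate. The sketch below is a generic construction under stated assumptions, not the procedure used in the paper: two genes are linked when the absolute Pearson correlation of their expression profiles reaches a threshold. The threshold of 0.8, the function names, and the toy expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def coexpression_edges(expression, threshold=0.8):
    """Link gene pairs whose |r| is at least the (assumed) threshold."""
    genes = sorted(expression)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expression[g], expression[h])
            if abs(r) >= threshold:
                edges[(g, h)] = r
    return edges

# Hypothetical toy expression profiles over four samples.
toy_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with g1
    "g3": [1.0, -1.0, 1.0, -1.0],  # weakly anti-correlated with g1 and g2
}
toy_edges = coexpression_edges(toy_expression)
```

Only the g1-g2 pair passes the threshold in this toy input; the choice of threshold strongly affects network density and would need tuning on real data.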
Affiliation(s)
- Nancy Lan Guo
- Mary Babb Randolph Cancer Center/School of Public Health, West Virginia University, Morgantown, WV, USA
- Ying-Wooi Wan
- Mary Babb Randolph Cancer Center/School of Public Health, West Virginia University, Morgantown, WV, USA
|
178
|
Integration strategy is a key step in network-based analysis and dramatically affects network topological properties and inferring outcomes. BioMed Research International 2014; 2014:296349. [PMID: 25243127 PMCID: PMC4163410 DOI: 10.1155/2014/296349] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/22/2014] [Revised: 07/14/2014] [Accepted: 07/17/2014] [Indexed: 01/17/2023]
Abstract
An increasing number of experiments have been designed to detect intracellular and intercellular molecular interactions. Based on these molecular interactions (especially protein interactions), molecular networks have been built for use in several typical applications, such as the discovery of new disease genes and the identification of drug targets and molecular complexes. Because the data are incomplete and a considerable number of false-positive interactions exist, protein interactions from different sources are commonly integrated in network analyses to build a stable molecular network. Although various types of integration strategies are being applied in current studies, the topological properties of the networks from these different integration strategies, especially typical applications based on these network integration strategies, have not been rigorously evaluated. In this paper, systematic analyses were performed to evaluate 11 frequently used methods using two types of integration strategies: empirical and machine learning methods. The topological properties of the networks of these different integration strategies were found to differ significantly. Moreover, these networks were found to dramatically affect the outcomes of typical applications, such as disease gene predictions, drug target detections, and molecular complex identifications. The analysis presented in this paper could provide an important basis for future network-based biological research.
|
179
|
Smedley D, Köhler S, Czeschik JC, Amberger J, Bocchini C, Hamosh A, Veldboer J, Zemojtel T, Robinson PN. Walking the interactome for candidate prioritization in exome sequencing studies of Mendelian diseases. Bioinformatics 2014; 30:3215-22. [PMID: 25078397 PMCID: PMC4221119 DOI: 10.1093/bioinformatics/btu508] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Indexed: 12/27/2022] Open
Abstract
Motivation: Whole-exome sequencing (WES) has opened up previously unheard of possibilities for identifying novel disease genes in Mendelian disorders, only about half of which have been elucidated to date. However, interpretation of WES data remains challenging. Results: Here, we analyze protein–protein association (PPA) networks to identify candidate genes in the vicinity of genes previously implicated in a disease. The analysis, using a random-walk with restart (RWR) method, is adapted to the setting of WES by developing a composite variant-gene relevance score based on the rarity, location and predicted pathogenicity of variants and the RWR evaluation of genes harboring the variants. Benchmarking using known disease variants from 88 disease-gene families reveals that the correct gene is ranked among the top 10 candidates in ≥50% of cases, a figure which we confirmed using a prospective study of disease genes identified in 2012 and PPA data produced before that date. We implement our method in a freely available Web server, ExomeWalker, that displays a ranked list of candidates together with information on PPAs, frequency and predicted pathogenicity of the variants to allow quick and effective searches for candidates that are likely to reward closer investigation. Availability and implementation: http://compbio.charite.de/ExomeWalker Contact: peter.robinson@charite.de
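The random-walk with restart (RWR) step named in the abstract above can be sketched as a simple iteration. This is a generic RWR on an unweighted, undirected network, not the ExomeWalker implementation: it omits the composite variant-gene relevance score, and the restart probability of 0.7, the function name, and the toy network are assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at seed nodes.

    adj: dict node -> set of neighbours (undirected; every node needs a neighbour).
    seeds: known disease genes, all present in adj.
    """
    nodes = sorted(adj)
    start = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        nxt = {u: restart * start[u] for u in nodes}
        for u in nodes:
            # the non-restarting share of u's probability is split among neighbours
            share = (1.0 - restart) * prob[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[u] - prob[u]) for u in nodes)
        prob = nxt
        if delta < tol:
            break
    return prob

# Hypothetical toy network: a path a - b - c - d with a single seed "a".
toy_adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
proximity = random_walk_with_restart(toy_adj, seeds={"a"})
```

Candidates closer to the seed in the network receive higher scores, which is the property that makes RWR useful for ranking genes in the vicinity of known disease genes.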
Affiliation(s)
- Damian Smedley, Sebastian Köhler, Johanna Christina Czeschik, Joanna Amberger, Carol Bocchini, Ada Hamosh, Julian Veldboer, Tomasz Zemojtel, Peter N Robinson
- Mouse Informatics Group, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
- Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin
- Genome Informatics Department, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Hufelandstr. 55, 45122 Essen, Germany
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Department of Mathematics and Computer Science, Institute for Bioinformatics, Freie Universität Berlin, Takustrasse 9, 14195 Berlin, Germany
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-701 Poznan, Poland
- Berlin-Brandenburg Center for Regenerative Therapies, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin
- Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany
|
180
|
Das J, Fragoza R, Lee HR, Cordero NA, Guo Y, Meyer MJ, Vo TV, Wang X, Yu H. Exploring mechanisms of human disease through structurally resolved protein interactome networks. Molecular BioSystems 2014; 10:9-17. [PMID: 24096645 DOI: 10.1039/c3mb70225a] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 11/21/2022]
Abstract
The study of the molecular basis of human disease has gained increasing attention over the past decade. With significant improvements in sequencing efficiency and throughput, a wealth of genotypic data has become available. However, the translation of this information into concrete advances in diagnostic and clinical setups has proved far more challenging. Two major reasons for this are the lack of functional annotation for genomic variants and the complex nature of genotype-to-phenotype relationships. One fundamental approach to bypass these issues is to examine the effects of genetic variation at the level of proteins, as they are directly involved in carrying out biological functions. Within the cell, proteins function by interacting with other proteins as part of an underlying interactome network. This network can be determined using interactome mapping - a combination of high-throughput experimental toolkits and curation from small-scale studies. Integrating structural information from co-crystals with the network allows generation of a structurally resolved network. Within the context of this network, the structural principles of disease mutations can be examined and used to generate reliable mechanistic hypotheses regarding disease pathogenesis.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Robert Fragoza
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Hao Ran Lee
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Nicolas A Cordero
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Yu Guo
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Michael J Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY 10065, USA
- Tommy V Vo
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Xiujuan Wang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
|
181
|
Human symptoms-disease network. Nat Commun 2014; 5:4212. [PMID: 24967666 DOI: 10.1038/ncomms5212] [Citation(s) in RCA: 316] [Impact Index Per Article: 31.6] [Received: 11/07/2013] [Accepted: 05/27/2014] [Indexed: 12/19/2022] Open
Abstract
In the post-genomic era, the elucidation of the relationship between the molecular origins of diseases and their resulting phenotypes is a crucial task for medical research. Here, we use a large-scale biomedical literature database to construct a symptom-based human disease network and investigate the connection between clinical manifestations of diseases and their underlying molecular interactions. We find that the symptom-based similarity of two diseases correlates strongly with the number of shared genetic associations and the extent to which their associated proteins interact. Moreover, the diversity of the clinical manifestations of a disease can be related to the connectivity patterns of the underlying protein interaction network. The comprehensive, high-quality map of disease-symptom relations can further be used as a resource helping to address important questions in the field of systems medicine, for example, the identification of unexpected associations between diseases, disease etiology research or drug design.
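A symptom-based similarity between two diseases, as discussed above, can be sketched under a simple assumption: each disease is represented as a vector of symptom weights, and similarity is the cosine of the angle between the two vectors. The abstract does not state which measure or weighting is used, so the cosine measure, the function name, and the toy symptom profiles below are all hypothetical.

```python
import math

def cosine_similarity(profile_a, profile_b):
    """Cosine similarity of two sparse symptom profiles (dict symptom -> weight)."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[s] * profile_b[s] for s in shared)
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty profile is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

# Hypothetical toy profiles; weights could be co-occurrence counts from literature.
disease_x = {"fever": 1.0, "cough": 1.0}
disease_y = {"fever": 1.0, "cough": 1.0}
disease_z = {"rash": 2.0}

sim_xy = cosine_similarity(disease_x, disease_y)
sim_xz = cosine_similarity(disease_x, disease_z)
```

Linking disease pairs whose similarity exceeds a chosen cutoff would then yield a symptom-based disease network of the kind the abstract describes.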
|
182
|
Caberlotto L, Nguyen TP. A systems biology investigation of neurodegenerative dementia reveals a pivotal role of autophagy. BMC Systems Biology 2014; 8:65. [PMID: 24908109 PMCID: PMC4077228 DOI: 10.1186/1752-0509-8-65] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 01/14/2014] [Accepted: 05/20/2014] [Indexed: 11/25/2022]
Abstract
Background Neurodegenerative dementia comprises chronic and progressive illnesses with major clinical features represented by progressive and permanent loss of cognitive and mental performance, including impairment of memory and brain functions. Many different forms of neurodegenerative dementia exist, but they are all characterized by death of specific subpopulations of neurons and accumulation of proteins in the brain. We incorporated data from OMIM and primary molecular targets of drugs in the different phases of the drug discovery process to try to reveal possible hidden mechanisms in neurodegenerative dementia. In the present study, a systems biology approach was used to investigate the molecular connections among seemingly distinct complex diseases with the shared clinical symptoms of dementia that could suggest related disease mechanisms. Results Network analysis was applied to characterize an interaction network of disease proteins and drug targets, revealing a major role of metabolism and, predominantly, of the autophagy process in dementia and, particularly, in tauopathies. Different phases of the autophagy molecular pathway appear to be implicated in the individual disease pathophysiology, and specific drug targets associated with autophagy modulation could be considered for pharmacological intervention. In particular, in view of their centrality and of their direct association with autophagy proteins in the network, PP2A subunits could be suggested as a suitable molecular target for the development of novel drugs. Conclusion The present systems biology investigation identifies the autophagy pathway as a central dysregulated process in neurodegenerative dementia with a prevalent involvement in diseases characterized by tau inclusions and indicates the disease-specific molecules in the pathway that could be considered for therapy.
Affiliation(s)
- Laura Caberlotto
- The Microsoft Research, University of Trento Centre for Computational Systems Biology (COSBI), Piazza Manifattura 1, 38068 Rovereto, Italy.
|
183
|
Lage K. Protein-protein interactions and genetic diseases: The interactome. Biochim Biophys Acta Mol Basis Dis 2014; 1842:1971-1980. [PMID: 24892209 DOI: 10.1016/j.bbadis.2014.05.028] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Received: 09/06/2013] [Revised: 05/07/2014] [Accepted: 05/24/2014] [Indexed: 12/27/2022]
Abstract
Protein-protein interactions mediate essentially all biological processes. Despite the quality of these data being widely questioned a decade ago, the reproducibility of large-scale protein interaction data is now much improved and there is little question that the latest screens are of high quality. Moreover, common data standards and coordinated curation practices between the databases that collect the interactions have made these valuable data available to a wide group of researchers. Here, I will review how protein-protein interactions are measured, collected and quality controlled. I discuss how the architecture of molecular protein networks has informed disease biology, and how these data are now being computationally integrated with the newest genomic technologies, in particular genome-wide association studies and exome-sequencing projects, to improve our understanding of molecular processes perturbed by genetics in human diseases. This article is part of a Special Issue entitled: From Genome to Function.
Affiliation(s)
- Kasper Lage
- Department of Surgery and Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; The Broad Institute of MIT and Harvard, Cambridge, MA, USA.
|
184
|
Li X, Zhou X, Peng Y, Liu B, Zhang R, Hu J, Yu J, Jia C, Sun C. Network based integrated analysis of phenotype-genotype data for prioritization of candidate symptom genes. BIOMED RESEARCH INTERNATIONAL 2014; 2014:435853. [PMID: 24991551 PMCID: PMC4060751 DOI: 10.1155/2014/435853] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 01/15/2014] [Accepted: 04/30/2014] [Indexed: 11/17/2022]
Abstract
BACKGROUND Symptoms and signs (symptoms in brief) are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM). To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. METHODS This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prioritization of candidate genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are utilized to construct a phenotype-genotype network. Finally, the PRINCE algorithm is used to rank the potential genes for the associated symptoms. RESULTS The proposed method produces a ranked gene list with an AUC (area under the curve) of 0.616 in classification. Some novel genes, such as CALCA, ESR1, and MTHFR, were predicted to be associated with headache symptoms; these associations are not recorded in the benchmark data set but have been reported in recently published literature. CONCLUSIONS Our study demonstrated that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identifying candidate genes of symptoms.
Affiliation(s)
- Xing Li
- School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
| | - Xuezhong Zhou
- School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
| | - Yonghong Peng
- School of Engineering and Informatics, University of Bradford, West Yorkshire BD7 1DP, UK
| | - Baoyan Liu
- China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Runshun Zhang
- Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing 100053, China
| | - Jingqing Hu
- Institute of Basic Theory of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Jian Yu
- School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
| | - Caiyan Jia
- School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
| | - Changkai Sun
- Liaoning Provincial Key Laboratory of Cerebral Diseases, Institute for Brain Disorders, Dalian Medical University, Dalian 116044, China
|
185
|
Zhu C, Wu C, Aronow BJ, Jegga AG. Computational approaches for human disease gene prediction and ranking. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2014; 799:69-84. [PMID: 24292962 DOI: 10.1007/978-1-4614-8778-4_4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023]
Abstract
While candidate gene association studies continue to be the most practical and frequently employed approach in disease gene investigation for complex disorders, selecting suitable genes to test is a challenge. There are several computational approaches available for selecting and prioritizing disease candidate genes. A majority of these tools are based on guilt-by-association principle where novel disease candidate genes are identified and prioritized based on either functional or topological similarity to known disease genes. In this chapter we review the prioritization criteria and the algorithms along with some use cases that demonstrate how these tools can be used for identifying and ranking human disease candidate genes.
Affiliation(s)
- Cheng Zhu
- Department of Computer Science, College of Engineering and Applied Science, University of Cincinnati, Cincinnati, OH, USA
|
186
|
Chepelev N, Chepelev L, Alamgir M, Golshani A. Large-Scale Protein-Protein Interaction Detection Approaches: Past, Present and Future. BIOTECHNOL BIOTEC EQ 2014. [DOI: 10.1080/13102818.2008.10817505] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/25/2022] Open
|
187
|
Guney E, Oliva B. Analysis of the robustness of network-based disease-gene prioritization methods reveals redundancy in the human interactome and functional diversity of disease-genes. PLoS One 2014; 9:e94686. [PMID: 24733074 PMCID: PMC3986215 DOI: 10.1371/journal.pone.0094686] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 02/13/2014] [Accepted: 03/13/2014] [Indexed: 11/18/2022] Open
Abstract
Complex biological systems usually pose a trade-off between robustness and fragility where a small number of perturbations can substantially disrupt the system. Although biological systems are robust against changes in many external and internal conditions, even a single mutation can perturb the system substantially, giving rise to a pathophenotype. Recent advances in identifying and analyzing the sequential variations beneath human disorders help to comprehend a systemic view of the mechanisms underlying various disease phenotypes. Network-based disease-gene prioritization methods rank the relevance of genes in a disease under the hypothesis that genes whose proteins interact with each other tend to exhibit similar phenotypes. In this study, we have tested the robustness of several network-based disease-gene prioritization methods with respect to the perturbations of the system using various disease phenotypes from the Online Mendelian Inheritance in Man database. These perturbations have been introduced either in the protein-protein interaction network or in the set of known disease-gene associations. As the network-based disease-gene prioritization methods are based on the connectivity between known disease-gene associations, we have further used these methods to categorize the pathophenotypes with respect to the recoverability of hidden disease-genes. Our results have suggested that, in general, disease-genes are connected through multiple paths in the human interactome. Moreover, even when these paths are disturbed, network-based prioritization can reveal hidden disease-gene associations in some pathophenotypes such as breast cancer, cardiomyopathy, diabetes, leukemia, Parkinson disease and obesity to a greater extent than in the rest of the pathophenotypes tested in this study. Gene Ontology (GO) analysis highlighted the role of functional diversity for such diseases.
Affiliation(s)
- Emre Guney
- Center for Complex Network Research, Northeastern University, Boston, Massachusetts, United States of America
| | - Baldo Oliva
- Structural Bioinformatics Group (GRIB), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
|
188
|
Zhang SW, Shao DD, Zhang SY, Wang YB. Prioritization of candidate disease genes by enlarging the seed set and fusing information of the network topology and gene expression. MOLECULAR BIOSYSTEMS 2014; 10:1400-8. [PMID: 24695957 DOI: 10.1039/c3mb70588a] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/13/2023]
Abstract
The identification of disease genes is very important not only to provide greater understanding of gene function and cellular mechanisms which drive human disease, but also to enhance human disease diagnosis and treatment. Recently, high-throughput techniques have been applied to detect dozens or even hundreds of candidate genes. However, experimental approaches to validate the many candidates are usually time-consuming, tedious and expensive, and sometimes lack reproducibility. Therefore, numerous theoretical and computational methods (e.g. network-based approaches) have been developed to prioritize candidate disease genes. Many network-based approaches implicitly utilize the observation that genes causing the same or similar diseases tend to correlate with each other in gene-protein relationship networks. Of these network approaches, the random walk with restart algorithm (RWR) is considered to be a state-of-the-art approach. To further improve the performance of RWR, we propose a novel method named ESFSC to identify disease-related genes, by enlarging the seed set according to the centrality of disease genes in a network and fusing information of the protein-protein interaction (PPI) network topological similarity and the gene expression correlation. The ESFSC algorithm restarts at all of the nodes in the seed set consisting of the known disease genes and their k-nearest neighbor nodes, then walks in the global network separately guided by the similarity transition matrix constructed with PPI network topological similarity properties and the correlational transition matrix constructed with the gene expression profiles. As a result, all the genes in the network are ranked by weighted fusing the above results of the RWR guided by two types of transition matrices. 
Comprehensive simulation results of the 10 diseases with 97 known disease genes collected from the Online Mendelian Inheritance in Man (OMIM) database show that ESFSC outperforms existing methods for prioritizing candidate disease genes. The top prediction results of Alzheimer's disease are consistent with previous literature reports.
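The random walk with restart (RWR) iteration that the abstract above builds on can be sketched as follows. This is a minimal, generic illustration in plain Python and not the ESFSC implementation: the function name, the restart probability of 0.3, the toy four-node path graph, and the column-normalised transition matrix are all assumptions made for the example. ESFSC additionally enlarges the seed set and fuses the results of walks guided by two different transition matrices, which is not shown here.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic RWR on a weighted adjacency matrix (list of lists).

    Iterates p <- (1 - restart) * W p + restart * p0, where W is the
    column-normalised adjacency matrix and p0 is uniform over the seeds.
    All parameter names and defaults are illustrative assumptions.
    """
    n = len(adj)
    # Column sums, used to turn each column into a probability distribution.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]

    # Restart vector: equal probability on every seed node.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(W[i][j] * p[j] for j in range(n))
               + restart * p0[i]
               for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if change < tol:  # stop once the L1 change is negligible
            break
    return p


# Toy path graph 0 - 1 - 2 - 3 with node 0 as the only known disease gene.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_adj, seeds=[0])
# Candidate genes would then be ranked by descending score.
ranking = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
```

In this toy setting the scores decay with distance from the seed, which is the behaviour that network-based prioritization relies on.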
Affiliation(s)
- Shao-Wu Zhang
- College of Automation, Northwestern Polytechnical University, 710072, Xi'an, China.
|
189
|
Wang Y, Fang H, Yang T, Wu D, Zhao J. Degree‐adjusted algorithm for prioritisation of candidate disease genes from gene expression and protein interactome. IET Syst Biol 2014; 8:41-6. [DOI: 10.1049/iet-syb.2013.0038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/31/2022] Open
Affiliation(s)
- Yichuan Wang
- Department of Mathematics, Logistical Engineering University, Chongqing, People's Republic of China
| | - Haiyang Fang
- Department of Mathematics, Logistical Engineering University, Chongqing, People's Republic of China
| | - Tinghong Yang
- Department of Mathematics, Logistical Engineering University, Chongqing, People's Republic of China
| | - Duzhi Wu
- Department of Mathematics, Logistical Engineering University, Chongqing, People's Republic of China
| | - Jing Zhao
- Department of Mathematics, Logistical Engineering University, Chongqing, People's Republic of China
|
190
|
Approaches for recognizing disease genes based on network. BIOMED RESEARCH INTERNATIONAL 2014; 2014:416323. [PMID: 24707485 PMCID: PMC3953674 DOI: 10.1155/2014/416323] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 12/06/2013] [Revised: 01/06/2014] [Accepted: 01/09/2014] [Indexed: 12/22/2022]
Abstract
Diseases are closely related to genes, thus indicating that genetic abnormalities may lead to certain diseases. The recognition of disease genes has long been a goal in biology, which may contribute to the improvement of health care and to the understanding of gene functions, pathways, and interactions. However, few large-scale gene-gene association datasets, disease-disease association datasets, and gene-disease association datasets are available. A number of machine learning methods have been used to recognize disease genes based on networks. This paper states the relationship between disease and gene, summarizes the network-based approaches used to recognize disease genes, analyzes the core problems and challenges of the methods, and outlines future research directions.
|
191
|
Bello AM, Wei L, Majchrzak-Kita B, Salum N, Purohit MK, Fish EN, Kotra LP. Small molecule mimetics of an interferon-α receptor interacting domain. Bioorg Med Chem 2014; 22:978-85. [DOI: 10.1016/j.bmc.2013.12.049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/17/2013] [Revised: 12/12/2013] [Accepted: 12/21/2013] [Indexed: 10/25/2022]
|
192
|
Wang XD, Huang JL, Yang L, Wei DQ, Qi YX, Jiang ZL. Identification of human disease genes from interactome network using graphlet interaction. PLoS One 2014; 9:e86142. [PMID: 24465923 PMCID: PMC3899204 DOI: 10.1371/journal.pone.0086142] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 05/03/2013] [Accepted: 12/05/2013] [Indexed: 11/18/2022] Open
Abstract
Identifying genes related to human diseases, such as cancer and cardiovascular disease, is an important task in biomedical research because of its applications in disease diagnosis and treatment. Interactome networks, especially protein-protein interaction networks, have been used for disease gene identification based on the hypothesis that strong candidate genes tend to be closely related to each other under some measure on the network. We proposed a new measure for analyzing the relationship between network nodes, called graphlet interaction. The graphlet interaction contained 28 different isomers. The results showed that the numbers of graphlet interaction isomers between disease genes in interactome networks were significantly larger than those between randomly picked genes, while graphlet signatures were not. Then, we designed a new type of score, based on the network properties, to identify disease genes using graphlet interaction. Genes with higher scores were more likely to be disease genes, and all candidate genes were ranked according to their scores. The approach was then evaluated by leave-one-out cross-validation. The precision of the current approach reached 90% at about 10% recall, which was higher than that of three previous predominant algorithms: random walk, Endeavour and a neighborhood-based method. Finally, the approach was applied to predict new disease genes related to 4 common diseases, most of which were identified by other independent experimental studies. In conclusion, we demonstrate that the graphlet interaction is an effective tool for analyzing the network properties of disease genes, and that the scores calculated by graphlet interaction are more precise in identifying disease genes.
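To make the "precision at a given recall" figure in the abstract above concrete, the following is a minimal sketch of how precision and recall can be computed for the top k genes of a ranked candidate list. The function name, the gene labels, and the cutoff are hypothetical and chosen only for illustration; the paper's own evaluation protocol (leave-one-out cross-validation over its benchmark) is not reproduced here.

```python
def precision_recall_at_k(ranked_genes, known_disease_genes, k):
    """Precision and recall of the top-k entries of a ranked gene list.

    ranked_genes: candidate genes ordered from highest to lowest score.
    known_disease_genes: set of genes treated as true positives.
    """
    top = ranked_genes[:k]
    hits = sum(1 for gene in top if gene in known_disease_genes)
    # Precision: fraction of the selected genes that are true disease genes.
    precision = hits / len(top) if top else 0.0
    # Recall: fraction of all true disease genes that were selected.
    recall = hits / len(known_disease_genes) if known_disease_genes else 0.0
    return precision, recall


# Hypothetical ranking and gold standard (placeholder labels, not real data).
ranked = ["G1", "G2", "G3", "G4", "G5"]
gold = {"G1", "G3", "G5", "G9"}
precision, recall = precision_recall_at_k(ranked, gold, k=2)
```

Sweeping k from 1 to the length of the list traces out a precision-recall curve, from which a value such as "precision at about 10% recall" can be read.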
Affiliation(s)
- Xiao-Dong Wang
- Institute of Mechanobiology and Medical Engineering, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Jia-Liang Huang
- Bioinformatics, Integrated Platform Science, GlaxoSmithKline Research and Development China, Shanghai, China
| | - Lun Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, Shanghai, China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Ying-Xin Qi
- Institute of Mechanobiology and Medical Engineering, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Zong-Lai Jiang
- Institute of Mechanobiology and Medical Engineering, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, China
|
193
|
Nguyen TP, Caberlotto L, Morine MJ, Priami C. Network analysis of neurodegenerative disease highlights a role of Toll-like receptor signaling. BIOMED RESEARCH INTERNATIONAL 2014; 2014:686505. [PMID: 24551850 PMCID: PMC3914352 DOI: 10.1155/2014/686505] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 09/05/2013] [Revised: 11/20/2013] [Accepted: 11/30/2013] [Indexed: 01/23/2023]
Abstract
Despite significant advances in the study of the molecular mechanisms altered in the development and progression of neurodegenerative diseases (NDs), the etiology is still enigmatic and the distinctions between diseases are not always entirely clear. We present an efficient computational method based on the protein-protein interaction (PPI) network to model the functional network of NDs. The aim of this work is fourfold: (i) reconstruction of a PPI network relating to the NDs, (ii) construction of an association network between diseases based on proximity in the disease PPI network, (iii) quantification of disease associations, and (iv) inference of potential molecular mechanisms involved in the diseases. The functional links of diseases not only showed overlap with the traditional classification in clinical settings, but also offered new insight into connections between diseases with limited clinical overlap. To gain an expanded view of the molecular mechanisms involved in NDs, both direct and indirect connector proteins were investigated. The method uncovered molecular relationships that are common to apparently distinct diseases and provided important insight into the molecular networks implicated in disease pathogenesis. In particular, the current analysis highlighted the Toll-like receptor signaling pathway as a potential candidate pathway to be targeted by therapy in neurodegeneration.
Affiliation(s)
- Thanh-Phuong Nguyen
- The Microsoft Research, University of Trento Centre for Computational Systems Biology (COSBI), Piazza Manifattura 1, 38068 Rovereto, Italy
| | - Laura Caberlotto
- The Microsoft Research, University of Trento Centre for Computational Systems Biology (COSBI), Piazza Manifattura 1, 38068 Rovereto, Italy
| | - Melissa J. Morine
- The Microsoft Research, University of Trento Centre for Computational Systems Biology (COSBI), Piazza Manifattura 1, 38068 Rovereto, Italy
- Department of Mathematics, University of Trento, Via Sommarive, 14-38123 Povo, Italy
| | - Corrado Priami
- The Microsoft Research, University of Trento Centre for Computational Systems Biology (COSBI), Piazza Manifattura 1, 38068 Rovereto, Italy
- Department of Mathematics, University of Trento, Via Sommarive, 14-38123 Povo, Italy
|
194
|
Nacher JC, Keith B, Schwartz JM. Network medicine analysis of chondrocyte proteins towards new treatments of osteoarthritis. Proc Biol Sci 2014; 281:20132907. [PMID: 24430851 DOI: 10.1098/rspb.2013.2907] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/05/2023] Open
Abstract
Osteoarthritis (OA) is a progressive disorder with high incidence in the ageing human population that currently has no treatment. This disorder induces the breakdown of articular cartilage, leading to the exposure and damage of bone surfaces. For a global understanding of OA development, the systematic integration of known OA-related proteins with protein-protein interaction (PPI) networks is required. In this work, the OA-related interactome was reconstructed using multiple data sources to have the most up-to-date information on OA-related proteins and their interactions. We then combined emergent concepts in network medicine to detect new unclassified OA-related proteins. The mapping of known OA-related proteins onto PPI networks showed that these proteins are locally connected to each other and agglomerated in a large component. To expand this module, we applied a diffusion-based algorithm that probabilistically induces more searches in the vicinity of the seed OA-related proteins. As a result, the 10 topmost ranked proteins were connected to the OA disease module, supporting the local hypothesis. We computed structural modules and selected those that had the highest enrichment of OA-related proteins. The identified molecules show a link between structural topology and disease dysfunctionality. Interestingly, the protein Q6EEV6 was highlighted for OA association by both methods, reinforcing the potential involvement of this protein. These results suggest that similar disease-connected modules may exist in different human disorders, which could lead to systematic identification of genes or proteins that have a joint role in specific disease phenotypes.
Affiliation(s)
- Jose C Nacher
- Department of Information Science, Faculty of Science, Toho University, Miyama 2-2-1, Funabashi, Chiba 274-8510, Japan; Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
|
195
|
Wu L, Shen Y, Li M, Wu FX. Drug Target Identification Based on Structural Output Controllability of Complex Networks. ACTA ACUST UNITED AC 2014. [DOI: 10.1007/978-3-319-08171-7_17] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/02/2023]
|
196
|
Safari-Alighiarloo N, Taghizadeh M, Rezaei-Tavirani M, Goliaei B, Peyvandi AA. Protein-protein interaction networks (PPI) and complex diseases. GASTROENTEROLOGY AND HEPATOLOGY FROM BED TO BENCH 2014; 7:17-31. [PMID: 25436094 PMCID: PMC4017556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2013] [Accepted: 12/23/2013] [Indexed: 11/16/2022]
Abstract
The physical interactions of proteins, which assemble them into large, densely connected networks, are a notable subject of investigation. Protein interaction networks are useful because they provide a basic scientific abstraction and improve biological and biomedical applications. Given the principal roles of proteins in biological function, their interactions determine the molecular and cellular mechanisms that control healthy and diseased states in organisms. Therefore, such networks facilitate the understanding of the pathogenic (and physiologic) mechanisms that trigger the onset and progression of diseases. Consequently, this knowledge can be translated into effective diagnostic and therapeutic strategies. Furthermore, the results of several studies have shown that the structure and dynamics of protein networks are disturbed in complex diseases such as cancer and autoimmune disorders. Based on this relationship, a novel paradigm is suggested in which protein interaction networks, rather than individual molecules considered without regard to the network, can be the target of therapy for complex multi-genic diseases.
Affiliation(s)
- Nahid Safari-Alighiarloo
- Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mohammad Taghizadeh
- Bioinformatics Department, Institute of Biochemistry and Biophysics, Tehran University, Tehran, Iran
| | - Mostafa Rezaei-Tavirani
- Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Bahram Goliaei
- Bioinformatics Department, Institute of Biochemistry and Biophysics, Tehran University, Tehran, Iran
| | - Ali Asghar Peyvandi
- Hearing Disorders Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
|
197
|
Siddani BR, Pochineni LP, Palanisamy M. Candidate gene identification for systemic lupus erythematosus using network centrality measures and gene ontology. PLoS One 2013; 8:e81766. [PMID: 24312583 PMCID: PMC3847089 DOI: 10.1371/journal.pone.0081766] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 06/18/2013] [Accepted: 10/16/2013] [Indexed: 01/12/2023] Open
Abstract
Systemic lupus erythematosus (SLE), commonly referred to as “the great imitator”, is a highly complex disease involving multiple gene susceptibility with non-specific symptoms. Many experimental and computational approaches have been used to investigate the disease-related candidate genes. However, the limited knowledge of gene function and disease correlation, together with the lack of complete functional details about the majority of genes in susceptible loci, hinders the identification of SLE-related candidate genes. In this paper, we have studied the human immunome network (undirected) using various graph theoretical centrality measures integrated with gene ontology terms to predict new candidate genes. As a result, we have identified 8 candidate genes, which may act as potential targets for SLE. We have also carried out the same analysis by replacing the human immunome network with the human immunome signaling network (directed), and as an outcome we have obtained 5 candidate genes as potential targets for SLE. From the comparison study, we have found that these two approaches are complementary in nature.
Affiliation(s)
- Bhaskara Rao Siddani
- C R Rao Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad, India
|
198
|
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. Network analysis of GWAS data. Curr Opin Genet Dev 2013; 23:602-10. [PMID: 24287332 PMCID: PMC3867794 DOI: 10.1016/j.gde.2013.09.003] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Received: 07/19/2013] [Revised: 09/19/2013] [Accepted: 09/23/2013] [Indexed: 02/07/2023]
Abstract
Genome-wide association studies (GWAS) identify genetic variants that distinguish a control population from a population with a specific trait. Two challenges in GWAS are: (1) identification of the causal variant within a longer haplotype that is associated with the trait; (2) identification of causal variants for polygenic traits that are caused by variants in multiple genes within a pathway. We review recent methods that use information in protein-protein and protein-DNA interaction networks to address these two challenges.
Affiliation(s)
- Mark D M Leiserson
- Department of Computer Science, Brown University, Providence, RI 02912, United States; Center for Computational Molecular Biology, Brown University, Providence, RI 02912, United States
|
199
|
Controllability in cancer metabolic networks according to drug targets as driver nodes. PLoS One 2013; 8:e79397. [PMID: 24282504 PMCID: PMC3839908 DOI: 10.1371/journal.pone.0079397] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Received: 08/07/2013] [Accepted: 09/30/2013] [Indexed: 11/24/2022] Open
Abstract
Networks are employed to represent many nonlinear complex systems in the real world. The topological aspects and relationships between the structure and function of biological networks have been widely studied in the past few decades. However dynamic and control features of complex networks have not been widely researched, in comparison to topological network features. In this study, we explore the relationship between network controllability, topological parameters, and network medicine (metabolic drug targets). Considering the assumption that targets of approved anticancer metabolic drugs are driver nodes (which control cancer metabolic networks), we have applied topological analysis to genome-scale metabolic models of 15 normal and corresponding cancer cell types. The results show that besides primary network parameters, more complex network metrics such as motifs and clusters may also be appropriate for controlling the systems providing the controllability relationship between topological parameters and drug targets. Consequently, this study reveals the possibilities of following a set of driver nodes in network clusters instead of considering them individually according to their centralities. This outcome suggests considering distributed control systems instead of nodal control for cancer metabolic networks, leading to a new strategy in the field of network medicine.
|
200
|
Kimmel C, Visweswaran S. An algorithm for network-based gene prioritization that encodes knowledge both in nodes and in links. PLoS One 2013; 8:e79564. [PMID: 24260251 PMCID: PMC3834271 DOI: 10.1371/journal.pone.0079564] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 12/04/2012] [Accepted: 09/25/2013] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Candidate gene prioritization aims to identify promising new genes associated with a disease or a biological process from a larger set of candidate genes. In recent years, network-based methods - which utilize a knowledge network derived from biological knowledge - have been utilized for gene prioritization. Biological knowledge can be encoded either through the network's links or nodes. Current network-based methods can only encode knowledge through links. This paper describes a new network-based method that can encode knowledge in links as well as in nodes. RESULTS We developed a new network inference algorithm called the Knowledge Network Gene Prioritization (KNGP) algorithm which can incorporate both link and node knowledge. The performance of the KNGP algorithm was evaluated on both synthetic networks and on networks incorporating biological knowledge. The results showed that the combination of link knowledge and node knowledge provided a significant benefit across 19 experimental diseases over using link knowledge alone or node knowledge alone. CONCLUSIONS The KNGP algorithm provides an advance over current network-based algorithms, because the algorithm can encode both link and node knowledge. We hope the algorithm will aid researchers with gene prioritization.
Affiliation(s)
- Chad Kimmel
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Shyam Visweswaran
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
|