Reference Citation Analysis

For: Tiffin N, Andrade-Navarro MA, Perez-Iratxeta C. Linking genes to diseases: it's all in the data. Genome Med 2009;1:77. [PMID: 19678910 PMCID: PMC2768963 DOI: 10.1186/gm77] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/10/2022] Open

Cited by Other Article(s)

He B, Wang K, Xiang J, Bing P, Tang M, Tian G, Guo C, Xu M, Yang J. DGHNE: network enhancement-based method in identifying disease-causing genes through a heterogeneous biomedical network. Brief Bioinform 2022;23:6712302. [PMID: 36151744 DOI: 10.1093/bib/bbac405] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/16/2022] [Revised: 08/01/2022] [Accepted: 08/21/2022] [Indexed: 12/14/2022] Open

Abstract

The identification of disease-causing genes is critical for mechanistic understanding of disease etiology and clinical manipulation in disease prevention and treatment. Yet the existing approaches in tackling this question are inadequate in accuracy and efficiency, demanding computational methods with higher identification power. Here, we proposed a new method called DGHNE to identify disease-causing genes through a heterogeneous biomedical network empowered by network enhancement. First, a disease-disease association network was constructed by the cosine similarity scores between phenotype annotation vectors of diseases, and a new heterogeneous biomedical network was constructed by using disease-gene associations to connect the disease-disease network and gene-gene network. Then, the heterogeneous biomedical network was further enhanced by using network embedding based on the Gaussian random projection. Finally, network propagation was used to identify candidate genes in the enhanced network. We applied DGHNE together with five other methods into the most updated disease-gene association database termed DisGeNet. Compared with all other methods, DGHNE displayed the highest area under the receiver operating characteristic curve and the precision-recall curve, as well as the highest precision and recall, in both the global 5-fold cross-validation and predicting new disease-gene associations. We further performed DGHNE in identifying the candidate causal genes of Parkinson's disease and diabetes mellitus, and the genes connecting hyperglycemia and diabetes mellitus. In all cases, the predicted causing genes were enriched in disease-associated gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways, and the gene-disease associations were highly evidenced by independent experimental studies.
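The first step described in this abstract, scoring disease pairs by the cosine similarity of their phenotype annotation vectors, can be illustrated with a minimal sketch. The function names and the toy annotation vectors below are hypothetical and are not taken from the DGHNE implementation; the sketch only shows the general technique.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A disease with no phenotype annotations is treated as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def disease_similarity_network(annotations):
    """Return {(disease_a, disease_b): score} for every unordered disease pair.

    `annotations` maps a disease name to its phenotype annotation vector
    (hypothetical toy input; real vectors would come from a phenotype ontology).
    """
    names = sorted(annotations)
    return {
        (a, b): cosine_similarity(annotations[a], annotations[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Toy example: three diseases annotated against four phenotype terms.
toy = {
    "disease_A": [1, 1, 0, 0],
    "disease_B": [1, 1, 0, 0],
    "disease_C": [0, 0, 1, 1],
}
network = disease_similarity_network(toy)
```

In this toy input, diseases A and B share all annotations and disease C shares none, so the resulting edge weights separate the two groups.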

Pavel A, Serra A, Cattelani L, Federico A, Greco D. Network Analysis of Microarray Data. Methods Mol Biol 2022;2401:161-186. [PMID: 34902128 DOI: 10.1007/978-1-0716-1839-4_11] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/14/2023]

Zolotareva O, Kleine M. A Survey of Gene Prioritization Tools for Mendelian and Complex Human Diseases. J Integr Bioinform 2019;16:/j/jib.ahead-of-print/jib-2018-0069/jib-2018-0069.xml. [PMID: 31494632 PMCID: PMC7074139 DOI: 10.1515/jib-2018-0069] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/06/2018] [Accepted: 07/12/2019] [Indexed: 12/16/2022] Open

GPS: Identification of disease genes by rank aggregation of multi-genomic scoring schemes. Genomics 2019;111:612-618. [DOI: 10.1016/j.ygeno.2018.03.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/20/2018] [Revised: 03/16/2018] [Accepted: 03/21/2018] [Indexed: 12/19/2022]

Freytag S, Burgess R, Oliver KL, Bahlo M. brain-coX: investigating and visualising gene co-expression in seven human brain transcriptomic datasets. Genome Med 2017;9:55. [PMID: 28595657 PMCID: PMC5465565 DOI: 10.1186/s13073-017-0444-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 12/21/2016] [Accepted: 05/26/2017] [Indexed: 12/17/2022] Open

Li S, Li R, Wang H, Li L, Li H, Li Y. The Key Genes of Chronic Pancreatitis which Bridge Chronic Pancreatitis and Pancreatic Cancer Can be Therapeutic Targets. Pathol Oncol Res 2017;24:215-222. [PMID: 28435988 DOI: 10.1007/s12253-017-0217-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/22/2016] [Accepted: 03/24/2017] [Indexed: 01/15/2023]

Chen Y, Li L, Zhang GQ, Xu R. Phenome-driven disease genetics prediction toward drug discovery. Bioinformatics 2015;31:i276-83. [PMID: 26072493 PMCID: PMC4542779 DOI: 10.1093/bioinformatics/btv245] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 12/22/2022] Open

Abstract

MOTIVATION

Discerning genetic contributions to diseases not only enhances our understanding of disease mechanisms, but also leads to translational opportunities for drug discovery. Recent computational approaches incorporate disease phenotypic similarities to improve the prediction power of disease gene discovery. However, most current studies used only one data source of human disease phenotype. We present an innovative and generic strategy for combining multiple different data sources of human disease phenotype and predicting disease-associated genes from integrated phenotypic and genomic data.

RESULTS

To demonstrate our approach, we explored a new phenotype database from biomedical ontologies and constructed Disease Manifestation Network (DMN). We combined DMN with mimMiner, which was a widely used phenotype database in disease gene prediction studies. Our approach achieved significantly improved performance over a baseline method, which used only one phenotype data source. In the leave-one-out cross-validation and de novo gene prediction analysis, our approach achieved the area under the curves of 90.7% and 90.3%, which are significantly higher than 84.2% (P < e(-4)) and 81.3% (P < e(-12)) for the baseline approach. We further demonstrated that our predicted genes have the translational potential in drug discovery. We used Crohn's disease as an example and ranked the candidate drugs based on the rank of drug targets. Our gene prediction approach prioritized druggable genes that are likely to be associated with Crohn's disease pathogenesis, and our rank of candidate drugs successfully prioritized the Food and Drug Administration-approved drugs for Crohn's disease. We also found literature evidence to support a number of drugs among the top 200 candidates. In summary, we demonstrated that a novel strategy combining unique disease phenotype data with system approaches can lead to rapid drug discovery.

AVAILABILITY AND IMPLEMENTATION

nlp.case.edu/public/data/DMN

Network regularised Cox regression and multiplex network models to predict disease comorbidities and survival of cancer. Comput Biol Chem 2015;59 Pt B:15-31. [DOI: 10.1016/j.compbiolchem.2015.08.010] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 06/13/2015] [Revised: 08/21/2015] [Accepted: 08/25/2015] [Indexed: 12/17/2022]

Theofilatos KA, Likothanassis S, Mavroudi S. Quo vadis computational analysis of PPI data or why the future isn't here yet. Front Genet 2015;6:289. [PMID: 26442107 PMCID: PMC4584938 DOI: 10.3389/fgene.2015.00289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/02/2015] [Accepted: 08/31/2015] [Indexed: 11/13/2022] Open

Emran NA. Data Completeness Measures. PATTERN ANALYSIS, INTELLIGENT SECURITY AND THE INTERNET OF THINGS 2015:117-130. [DOI: 10.1007/978-3-319-17398-6_11] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 09/01/2023]

Assessment of curated phenotype mining in neuropsychiatric disorder literature. Methods 2014;74:90-6. [PMID: 25484337 DOI: 10.1016/j.ymeth.2014.11.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/25/2014] [Revised: 11/25/2014] [Accepted: 11/27/2014] [Indexed: 12/14/2022] Open

Abstract

Clinical evaluation of patients and diagnosis of disorder is crucial to make decisions on appropriate therapies. In addition, in the case of genetic disorders resulting from gene abnormalities, phenotypic effects may guide basic research on the mechanisms of a disorder to find the mutated gene and therefore to propose novel targets for drug therapy. However, this approach is complicated by two facts. First, the relationship between genes and disorders is not simple: one gene may be related to multiple disorders and a disorder may be caused by mutations in different genes. Second, recognizing relevant phenotypes might be difficult for clinicians working with patients of closely related complex disorders. Neuropsychiatric disorders best illustrate these difficulties since phenotypes range from metabolic to behavioral aspects, the latter extremely complex. Based on our clinical expertise on five neurodegenerative disorders, and from the wealth of bibliographical data on neuropsychiatric disorders, we have built a resource to infer associations between genes, chemicals, phenotypes for a total of 31 disorders. An initial step of automated text mining of the literature related to 31 disorders returned thousands of enriched terms. Fewer relevant phenotypic terms were manually selected by clinicians as relevant to the five neural disorders of their expertise and used to analyze the complete set of disorders. Analysis of the data indicates general relationships between neuropsychiatric disorders, which can be used to classify and characterize them. Correlation analyses allowed us to propose novel associations of genes and drugs with disorders. 
More generally, the results led us to uncovering mechanisms of disease that span multiple neuropsychiatric disorders, for example that genes related to synaptic transmission and receptor functions tend to be involved in many disorders, whereas genes related to sensory perception and channel transport functions are associated with fewer disorders. Our study shows that starting from expertise covering a limited set of neurological disorders and using text and data mining methods, meaningful and novel associations regarding genes, chemicals and phenotypes can be derived for an expanded set of neuropsychiatric disorders. Our results are intended for clinicians to help them evaluate patients, and for basic scientists to propose new gene targets for drug therapies. This strategy can be extended to virtually all diseases and takes advantage of the ever increasing amount of biomedical literature.

Chen Y, Xu R. Mining cancer-specific disease comorbidities from a large observational health database. Cancer Inform 2014;13:37-44. [PMID: 25392682 PMCID: PMC4216041 DOI: 10.4137/cin.s13893] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 03/19/2014] [Revised: 04/29/2014] [Accepted: 04/30/2014] [Indexed: 12/28/2022] Open

Chen Y, Zhang X, Zhang GQ, Xu R. Comparative analysis of a novel disease phenotype network based on clinical manifestations. J Biomed Inform 2014;53:113-20. [PMID: 25277758 DOI: 10.1016/j.jbi.2014.09.007] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 06/16/2014] [Revised: 08/18/2014] [Accepted: 09/21/2014] [Indexed: 12/21/2022]

Liu RL, Shih CC. Identification of highly related references about gene-disease association. BMC Bioinformatics 2014;15:286. [PMID: 25155502 PMCID: PMC4162969 DOI: 10.1186/1471-2105-15-286] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/07/2013] [Accepted: 08/12/2014] [Indexed: 02/03/2023] Open

Valentini G, Paccanaro A, Caniza H, Romero AE, Re M. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods. Artif Intell Med 2014;61:63-78. [PMID: 24726035 PMCID: PMC4070077 DOI: 10.1016/j.artmed.2014.03.003] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 09/11/2013] [Revised: 03/05/2014] [Accepted: 03/10/2014] [Indexed: 02/07/2023]

Abstract

OBJECTIVE

In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization.

MATERIALS AND METHODS

We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions.

RESULTS

The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation.

CONCLUSIONS

Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network.
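Random walk with restart, one of the classical prioritization algorithms named in this abstract, can be sketched as follows. This is an illustrative toy on an unweighted graph, not the authors' implementation: the function name, the restart probability of 0.3, and the four-gene path network are all assumptions made for the example.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability.

    adj     : dict mapping each node to a list of its neighbours (undirected graph)
    seeds   : collection of known disease genes where the walk (re)starts
    restart : probability of jumping back to a seed at each step (assumed value)
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                # Isolated node: send its probability mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            share = (1 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy path network g1 - g2 - g3 - g4 with g1 as the only known disease gene.
toy_network = {
    "g1": ["g2"],
    "g2": ["g1", "g3"],
    "g3": ["g2", "g4"],
    "g4": ["g3"],
}
scores = random_walk_with_restart(toy_network, seeds={"g1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Candidate genes closer to the seed receive higher scores, which is the guilt-by-association intuition behind this family of methods.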

High-Throughput Translational Medicine: Challenges and Solutions. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2014;799:39-67. [DOI: 10.1007/978-1-4614-8778-4_3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 01/04/2023]

Network Analysis of Human Disease Comorbidity Patterns Based on Large-Scale Data Mining. BIOINFORMATICS RESEARCH AND APPLICATIONS 2014. [DOI: 10.1007/978-3-319-08171-7_22] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/21/2022]

Emran NA, Embury S, Missier P. Measuring Population-Based Completeness for Single Nucleotide Polymorphism (SNP) Databases. ADVANCED APPROACHES TO INTELLIGENT INFORMATION AND DATABASE SYSTEMS 2014:173-182. [DOI: 10.1007/978-3-319-05503-9_17] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 09/01/2023]

Nayak L, Tunga H, De RK. Disease co-morbidity and the human Wnt signaling pathway: a network-wise study. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2013;17:318-37. [PMID: 23692364 DOI: 10.1089/omi.2012.0053] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]

Kamphans T, Sabri P, Zhu N, Heinrich V, Mundlos S, Robinson PN, Parkhomchuk D, Krawitz PM. Filtering for compound heterozygous sequence variants in non-consanguineous pedigrees. PLoS One 2013;8:e70151. [PMID: 23940540 PMCID: PMC3734130 DOI: 10.1371/journal.pone.0070151] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 04/20/2013] [Accepted: 06/20/2013] [Indexed: 01/06/2023] Open

Xu R, Li L, Wang Q. Towards building a disease-phenotype knowledge base: extracting disease-manifestation relationship from literature. Bioinformatics 2013;29:2186-94. [PMID: 23828786 DOI: 10.1093/bioinformatics/btt359] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 01/28/2023] Open

Abstract

MOTIVATION

Systems approaches to studying phenotypic relationships among diseases are emerging as an active area of research for both novel disease gene discovery and drug repurposing. Currently, systematic study of disease phenotypic relationships on a phenome-wide scale is limited because large-scale machine-understandable disease-phenotype relationship knowledge bases are often unavailable. Here, we present an automatic approach to extract disease-manifestation (D-M) pairs (one specific type of disease-phenotype relationship) from the wide body of published biomedical literature.

DATA AND METHODS

Our method leverages external knowledge and limits the amount of human effort required. For the text corpus, we used 119 085 682 MEDLINE sentences (21 354 075 citations). First, we used D-M pairs from existing biomedical ontologies as prior knowledge to automatically discover D-M-specific syntactic patterns. We then extracted additional pairs from MEDLINE using the learned patterns. Finally, we analysed correlations between disease manifestations and disease-associated genes and drugs to demonstrate the potential of this newly created knowledge base in disease gene discovery and drug repurposing.

RESULTS

In total, we extracted 121 359 unique D-M pairs with a high precision of 0.924. Among the extracted pairs, 120 419 (99.2%) have not been captured in existing structured knowledge sources. We have shown that disease manifestations correlate positively with both disease-associated genes and drug treatments.

CONCLUSIONS

The main contribution of our study is the creation of a large-scale and accurate D-M phenotype relationship knowledge base. This unique knowledge base, when combined with existing phenotypic, genetic and proteomic datasets, can have profound implications in our deeper understanding of disease etiology and in rapid drug repurposing.

AVAILABILITY

http://nlp.case.edu/public/data/DMPatternUMLS/

Nie Y, Yu J. Mining breast cancer genes with a network based noise-tolerant approach. BMC SYSTEMS BIOLOGY 2013;7:49. [PMID: 23799982 PMCID: PMC3702465 DOI: 10.1186/1752-0509-7-49] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 11/28/2012] [Accepted: 06/21/2013] [Indexed: 12/22/2022]

Abstract

BACKGROUND

Mining novel breast cancer genes is an important task in breast cancer research. Many approaches prioritize candidate genes based on their similarity to known cancer genes, usually by integrating multiple data sources. However, different types of data often contain varying degrees of noise. For effective data integration, it's important to design methods that work robustly with respect to noise.

RESULTS

Gene Ontology (GO) annotations were often utilized in cancer gene mining works. However, the vast majority of GO annotations were computationally derived, thus not completely accurate. A set of genes annotated with breast cancer enriched GO terms was adopted here as a set of source data with realistic noise. A novel noise tolerant approach was proposed to rank candidate breast cancer genes using noisy source data within the framework of a comprehensive human Protein-Protein Interaction (PPI) network. Performance of the proposed method was quantitatively evaluated by comparing it with the more established random walk approach. Results showed that the proposed method exhibited better performance in ranking known breast cancer genes and higher robustness against data noise than the random walk approach. When noise started to increase, the proposed method was able to maintain relatively stable performance, while the random walk approach showed drastic performance decline; when noise increased to a large extent, the proposed method was still able to achieve better performance than random walk did.

CONCLUSIONS

A novel noise tolerant method was proposed to mine breast cancer genes. Compared to the well established random walk approach, it showed better performance in correctly ranking cancer genes and worked robustly with respect to noise within source data. To the best of our knowledge, it's the first such effort to quantitatively analyze noise tolerance between different breast cancer gene mining methods. The sorted gene list can be valuable for breast cancer research. The proposed quantitative noise analysis method may also prove useful for other data integration efforts. It is hoped that the current work can lead to more discussions about influence of data noise on different computational methods for mining disease genes.

Emran NA, Embury S, Missier P, Ahmad N. Reference Architectures to Measure Data Completeness across Integrated Databases. INTELLIGENT INFORMATION AND DATABASE SYSTEMS 2013:216-225. [DOI: 10.1007/978-3-642-36546-1_23] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 09/01/2023]

Emran NA, Embury S, Missier P, Isa MNM, Muda AK. Measuring Data Completeness for Microbial Genomics Database. INTELLIGENT INFORMATION AND DATABASE SYSTEMS 2013:186-195. [DOI: 10.1007/978-3-642-36546-1_20] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 09/01/2023]

Gonçalves JP, Francisco AP, Moreau Y, Madeira SC. Interactogeneous: disease gene prioritization using heterogeneous networks and full topology scores. PLoS One 2012. [PMID: 23185389 PMCID: PMC3501465 DOI: 10.1371/journal.pone.0049634] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022] Open

Abstract

Disease gene prioritization aims to suggest potential implications of genes in disease susceptibility. Often accomplished in a guilt-by-association scheme, promising candidates are sorted according to their relatedness to known disease genes. Network-based methods have been successfully exploiting this concept by capturing the interaction of genes or proteins into a score. Nonetheless, most current approaches yield at least some of the following limitations: (1) networks comprise only curated physical interactions leading to poor genome coverage and density, and bias toward a particular source; (2) scores focus on adjacencies (direct links) or the most direct paths (shortest paths) within a constrained neighborhood around the disease genes, ignoring potentially informative indirect paths; (3) global clustering is widely applied to partition the network in an unsupervised manner, attributing little importance to prior knowledge; (4) confidence weights and their contribution to edge differentiation and ranking reliability are often disregarded. We hypothesize that network-based prioritization related to local clustering on graphs and considering full topology of weighted gene association networks integrating heterogeneous sources should overcome the above challenges. We term such a strategy Interactogeneous. We conducted cross-validation tests to assess the impact of network sources, alternative path inclusion and confidence weights on the prioritization of putative genes for 29 diseases. Heat diffusion ranking proved the best prioritization method overall, increasing the gap to neighborhood and shortest paths scores mostly on single source networks. Heterogeneous associations consistently delivered superior performance over single source data across the majority of methods. Results on the contribution of confidence weights were inconclusive. 
Finally, the best Interactogeneous strategy, heat diffusion ranking and associations from the STRING database, was used to prioritize genes for Parkinson’s disease. This method effectively recovered known genes and uncovered interesting candidates which could be linked to pathogenic mechanisms of the disease.

Andrade-Navarro MA. Mining the literature: new methods to exploit keyword profiles. Genome Med 2012;4:81. [PMID: 23114100 PMCID: PMC3580450 DOI: 10.1186/gm382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open

Börnigen D, Tranchevent LC, Bonachela-Capdevila F, Devriendt K, De Moor B, De Causmaecker P, Moreau Y. An unbiased evaluation of gene prioritization tools. Bioinformatics 2012;28:3081-8. [PMID: 23047555 DOI: 10.1093/bioinformatics/bts581] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 02/06/2023] Open

Cheung WA, Ouellette BF, Wasserman WW. Inferring novel gene-disease associations using Medical Subject Heading Over-representation Profiles. Genome Med 2012;4:75. [PMID: 23021552 PMCID: PMC3580445 DOI: 10.1186/gm376] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 04/28/2012] [Revised: 09/11/2012] [Accepted: 09/28/2012] [Indexed: 01/08/2023] Open

Magger O, Waldman YY, Ruppin E, Sharan R. Enhancing the prioritization of disease-causing genes through tissue specific protein interaction networks. PLoS Comput Biol 2012;8:e1002690. [PMID: 23028288 PMCID: PMC3459874 DOI: 10.1371/journal.pcbi.1002690] [Citation(s) in RCA: 110] [Impact Index Per Article: 9.2] [Received: 10/15/2011] [Accepted: 07/28/2012] [Indexed: 01/07/2023] Open

Computational tools for prioritizing candidate genes: boosting disease gene discovery. Nat Rev Genet 2012;13:523-36. [DOI: 10.1038/nrg3253] [Citation(s) in RCA: 332] [Impact Index Per Article: 27.7] [Indexed: 12/14/2022]

Doncheva NT, Kacprowski T, Albrecht M. Recent approaches to the prioritization of candidate disease genes. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2012;4:429-42. [PMID: 22689539 DOI: 10.1002/wsbm.1177] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 11/10/2022]

Gilissen C, Hoischen A, Brunner HG, Veltman JA. Disease gene identification strategies for exome sequencing. Eur J Hum Genet 2012;20:490-7. [PMID: 22258526 PMCID: PMC3330229 DOI: 10.1038/ejhg.2011.258] [Citation(s) in RCA: 310] [Impact Index Per Article: 25.8] [Received: 08/24/2011] [Revised: 10/31/2011] [Accepted: 12/07/2011] [Indexed: 12/16/2022] Open

Piro RM, Di Cunto F. Computational approaches to disease-gene prediction: rationale, classification and successes. FEBS J 2012;279:678-96. [PMID: 22221742 DOI: 10.1111/j.1742-4658.2012.08471.x] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Indexed: 11/27/2022]

Li X, Li C, Shang D, Li J, Han J, Miao Y, Wang Y, Wang Q, Li W, Wu C, Zhang Y, Li X, Yao Q. The implications of relationships between human diseases and metabolic subpathways. PLoS One 2011;6:e21131. [PMID: 21695054 PMCID: PMC3117879 DOI: 10.1371/journal.pone.0021131] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 01/26/2011] [Accepted: 05/20/2011] [Indexed: 01/08/2023] Open

Abstract

One of the challenging problems in the etiology of diseases is to explore the relationships between initiation and progression of diseases and abnormalities in local regions of metabolic pathways. To gain insight into such relationships, we applied the “k-clique” subpathway identification method to all disease-related gene sets. For each disease, the disease risk regions of metabolic pathways were then identified and considered as subpathways associated with the disease. We finally built a disease-metabolic subpathway network (DMSPN). Through analyses based on network biology, we found that a few subpathways, such as that of cytochrome P450, were highly connected with many diseases, and most belonged to fundamental metabolisms, suggesting that abnormalities of fundamental metabolic processes tend to cause more types of diseases. According to the categories of diseases and subpathways, we tested the clustering phenomenon of diseases and metabolic subpathways in the DMSPN. The results showed that both disease nodes and subpathway nodes displayed slight clustering phenomenon. We also tested correlations between network topology and genes within disease-related metabolic subpathways, and found that within a disease-related subpathway in the DMSPN, the ratio of disease genes and the ratio of tissue-specific genes significantly increased as the number of diseases caused by the subpathway increased. Surprisingly, the ratio of essential genes significantly decreased and the ratio of housekeeping genes remained relatively unchanged. Furthermore, the coexpression levels between disease genes and other types of genes were calculated for each subpathway in the DMSPN. 
The results indicated that those genes intensely influenced by disease genes, including essential genes and tissue-specific genes, might be significantly associated with the disease diversity of subpathways, suggesting that different kinds of genes within a disease-related subpathway may play significantly differential roles on the diversity of diseases caused by the corresponding subpathway.

Oti M, Ballouz S, Wouters MA. Web tools for the prioritization of candidate disease genes. Methods Mol Biol 2011;760:189-206. [PMID: 21779998 DOI: 10.1007/978-1-61779-176-5_12] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 05/31/2023]

Lacson R, Mbagwu M, Yousif H, Ohno-Machado L. Assessing the quality of annotations in asthma gene expression experiments. BMC Bioinformatics 2010;11 Suppl 9:S8. [PMID: 21044366 PMCID: PMC2967749 DOI: 10.1186/1471-2105-11-s9-s8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022] Open