Suratanee A, Plaimas K. Network-based association analysis to infer new disease-gene relationships using large-scale protein interactions.
PLoS One 2018;
13:e0199435. [PMID:
29949603 PMCID:
PMC6021074 DOI:
10.1371/journal.pone.0199435]
[Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2018] [Accepted: 06/07/2018] [Indexed: 01/02/2023] Open
Abstract
Protein-protein interactions integrated with disease-gene associations represent important information for revealing protein functions under disease conditions to improve the prevention, diagnosis, and treatment of complex diseases. Although several studies have attempted to identify disease-gene associations, the number of possible disease-gene associations is very small. High-throughput technologies have been established experimentally to identify the association between genes and diseases. However, these techniques are still quite expensive, time consuming, and even difficult to perform. Thus, based on currently available data and knowledge, computational methods have served as alternatives to provide more possible associations to increase our understanding of disease mechanisms. Here, a new network-based algorithm, namely, Disease-Gene Association (DGA), was developed to calculate the association score of a query gene to a new possible set of diseases. First, a large-scale protein interaction network was constructed, and the relationship between two interacting proteins was calculated with regard to the disease relationship. Novel plausible disease-gene pairs were identified and statistically scored by our algorithm using neighboring protein information. The results yielded high performance for disease-gene prediction, with an F-measure of 0.78 and an AUC of 0.86. To identify promising candidates of disease-gene associations, the association coverage of genes and diseases were calculated and used with the association score to perform gene and disease selection. Based on gene selection, we identified promising pairs that exhibited evidence related to several important diseases, e.g., inflammation, lipid metabolism, inborn errors, xanthomatosis, cerebellar ataxia, cognitive deterioration, malignant neoplasms of the skin and malignant tumors of the cervix. Focusing on disease selection, we identified target genes that were important to blistering skin diseases and muscular dystrophy. In summary, our developed algorithm is simple, efficiently identifies disease–gene associations in the protein-protein interaction network and provides additional knowledge regarding disease-gene associations. This method can be generalized to other association studies to further advance biomedical science.
Collapse