51
Hao S, Wang R, Zhang Y, Zhan H. Prediction of Alzheimer's Disease-Associated Genes by Integration of GWAS Summary Data and Expression Data. Front Genet 2019; 9:653. [PMID: 30666269 PMCID: PMC6330278 DOI: 10.3389/fgene.2018.00653] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 10/15/2018] [Accepted: 12/03/2018] [Indexed: 12/20/2022] Open
Abstract
Alzheimer's disease (AD) is the most common cause of dementia and the fifth leading cause of death among elderly people. Because AD has high genetic heritability (79%), finding its causal genes is a crucial step toward finding a treatment. Following the International Genomics of Alzheimer's Project (IGAP), many disease-associated genes have been identified; however, we do not know enough about how those genes affect gene expression and disease-related pathways. We integrated GWAS summary data from IGAP with five different expression-level datasets using the transcriptome-wide association study method and identified 15 disease-causal genes under strict multiple testing (α < 0.05), four of which are newly identified. We identified an additional 29 potential disease-causal genes under a false discovery rate (α < 0.05), 21 of which are newly identified. Many of the genes we identified are also associated with an autoimmune disorder.
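The two significance thresholds in this abstract can be illustrated with a small sketch. The abstract does not name the exact procedures, so this assumes that "strict multiple testing" means a Bonferroni correction and that the false discovery rate is controlled with the Benjamini-Hochberg step-up procedure; the function names and p-values are hypothetical.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject a test only if its p-value is below alpha / number_of_tests.

    Assumption: stands in for the abstract's "strict multiple testing".
    """
    m = len(pvalues)
    return [p < alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for false discovery rate control.

    Assumption: stands in for the abstract's "false discovery rate" threshold.
    """
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every test ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Hypothetical gene-level p-values, for illustration only.
toy_pvalues = [0.001, 0.012, 0.025, 0.045, 0.5]
strict_hits = bonferroni(toy_pvalues)
fdr_hits = benjamini_hochberg(toy_pvalues)
```

On the toy input the FDR threshold admits more tests than the Bonferroni threshold, which mirrors the pattern of 15 genes under the strict threshold plus 29 additional genes under the FDR threshold.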
Affiliation(s)
- Sicheng Hao
- College of Computer and Information Science, Northeastern University, Boston, MA, United States
- Rui Wang
- College of Computer and Information Science, Northeastern University, Boston, MA, United States
- Yu Zhang
- Department of Neurosurgery, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
- Hui Zhan
- College of Electronic Engineering, Heilongjiang University, Harbin, China
52
Abstract
BACKGROUND Recently, measuring phenotype similarity has begun to play an important role in disease diagnosis, and researchers have begun to pay attention to developing phenotype similarity measures. However, existing methods ignore the interactions between phenotype-associated proteins, which may lead to inaccurate phenotype similarity. RESULTS We proposed a network-based method, PhenoNet, to calculate the similarity between phenotypes. We localized phenotypes in the network and calculated the similarity between phenotype-associated modules by modeling both the inter- and intra-similarity. CONCLUSIONS PhenoNet was evaluated on two independent evaluation datasets: gene ontology and gene expression data. The results show that PhenoNet performs better than the state-of-the-art methods on all evaluation tests.
Affiliation(s)
- Jiajie Peng
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
- Weiwei Hui
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
53
Wang Z, Wu X, Wang Y. A framework for analyzing DNA methylation data from Illumina Infinium HumanMethylation450 BeadChip. BMC Bioinformatics 2018; 19:115. [PMID: 29671397 PMCID: PMC5907140 DOI: 10.1186/s12859-018-2096-3] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 12/14/2022] Open
Abstract
Background DNA methylation has been found to be widely associated with complex diseases. Among biological platforms for profiling DNA methylation in humans, the Illumina Infinium HumanMethylation450 BeadChip (450K) has been accepted as one of the most efficient technologies. However, analyzing the DNA methylation data generated by this technology is challenging because of widespread biases. Results Here we proposed a generalized framework for evaluating data analysis methods for the Illumina 450K array. This framework considers the following steps towards a successful analysis: importing data, quality control, within-array normalization, correcting type bias, detecting differentially methylated probes or regions, and biological interpretation. Conclusions We evaluated five methods using three real datasets and recommended the best-performing methods for Illumina 450K array data analysis. Minfi and methylumi are the optimal choices when analyzing small datasets. BMIQ and RCP are suitable for correcting type bias, and their normalized results can be used to discover differentially methylated probes (DMPs). The R package missMethyl is suitable for GO term enrichment analysis and biological interpretation.
Affiliation(s)
- Zhenxing Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
- XiaoLiang Wu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
- Yadong Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
54
Hao X, Hao J, Wang L, Hou H. Effective norm emergence in cell systems under limited communication. BMC Bioinformatics 2018; 19:119. [PMID: 29671391 PMCID: PMC5907317 DOI: 10.1186/s12859-018-2097-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
Background The cooperation of cells in biological systems is similar to that of agents in cooperative multi-agent systems. Research findings in the multi-agent systems literature can provide valuable inspiration for biological research. The well-coordinated states in cell systems can be viewed as desirable social norms in cooperative multi-agent systems. One important research question is how a norm can rapidly emerge with limited communication resources. Results In this work, we propose a learning approach that trades off the agents’ performance in coordinating on a consistent norm against the communication cost involved. During the learning process, the agents can dynamically adjust their coordination set according to their own observations and pick out the most crucial agents to coordinate with. In this way, our method significantly reduces the coordination dependence among agents. Conclusion The experimental results show that our method can efficiently facilitate social norm emergence among agents and also scales well to large populations.
Affiliation(s)
- Xiaotian Hao
- School of Computer Science and Software, Tianjin University, Peiyang Park Campus: No.135 Yaguan Road, Haihe Education Park, Tianjin, 300350, China
- Jianye Hao
- School of Computer Science and Software, Tianjin University, Peiyang Park Campus: No.135 Yaguan Road, Haihe Education Park, Tianjin, 300350, China
- Li Wang
- School of Computer Science and Software, Tianjin University, Peiyang Park Campus: No.135 Yaguan Road, Haihe Education Park, Tianjin, 300350, China
- Hanxu Hou
- School of Electrical Engineering and Intelligentization, Dongguan University of Technology, No. 1, University Road, Songshan Lake District, Dongguan, 221116, China
55
Sun S, Sun X, Zheng Y. Higher-order partial least squares for predicting gene expression levels from chromatin states. BMC Bioinformatics 2018; 19:113. [PMID: 29671394 PMCID: PMC5907142 DOI: 10.1186/s12859-018-2100-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/23/2023] Open
Abstract
Background Extensive studies have shown that gene expression levels are strongly affected by chromatin mark combinations via at least two mechanisms, i.e., activation or repression, but their combinatorial patterns are still unclear. To further understand the relationship between histone modifications and gene expression levels, we introduce a purely geometric higher-order representation, the tensor (also called a multidimensional array), which might capture more unknown interactions in chromatin states for predicting gene expression levels. Results The prediction models were learned from regions spanning 10k base pairs upstream and 10k base pairs downstream of the transcriptional start sites (TSSs) in three species (i.e., Human, Rhesus Macaque, and Chimpanzee) with five histone modifications (i.e., H3K4me1, H3K4me3, H3K27ac, H3K27me3, and Pol II). Experimental results demonstrate that the proposed method predicts gene expression levels better than several other popular methods. Specifically, our method improves performance on two commonly used criteria, R and RMSE, by as much as 1.7% and 11%, respectively. Conclusions The overall aim of this work is to show that the higher-order representation is able to include more unknown interaction information between histone modifications across different species.
Affiliation(s)
- Shiquan Sun
- School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, People's Republic of China; Department of Biostatistics, University of Michigan, Ann Arbor, 48109, MI, USA
- Xifang Sun
- School of Science, Xi'an Shiyou University, Xi'an, 710065, Shaanxi, People's Republic of China
- Yan Zheng
- School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, People's Republic of China
56
Tou H, Yao L, Wei Z, Zhuang X, Zhang B. Automatic infection detection based on electronic medical records. BMC Bioinformatics 2018; 19:117. [PMID: 29671399 PMCID: PMC5907141 DOI: 10.1186/s12859-018-2101-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Making accurate patient care decisions as early as possible is a constant challenge, especially for physicians in the emergency department. The increasing volume of electronic medical records (EMRs) opens new horizons for automatic diagnosis. In this paper, we propose to use machine learning approaches for automatic infection detection based on EMRs. Five categories of information are utilized for prediction: personal information, admission notes, vital signs, diagnostic test results, and medical image diagnoses. RESULTS Experimental results on a newly constructed EMR dataset from an emergency department show that machine learning models can achieve decent performance for infection detection, with an area under the receiver operating characteristic curve (AUC) of 0.88. Of the five types of information, admission notes in text form contribute the most, with an AUC of 0.87. CONCLUSIONS This study provides a state-of-the-art EMR processing system to automatically make medical decisions. It extracts five types of features associated with infection and achieves decent performance on automatic infection detection based on machine learning models.
Affiliation(s)
- Huaixiao Tou
- School of Data Science, Fudan University, Shanghai, China
- Lu Yao
- Zhongshan Hospital Affiliated to Fudan University, Shanghai, China
- Zhongyu Wei
- School of Data Science, Fudan University, Shanghai, China
- Xiahai Zhuang
- School of Data Science, Fudan University, Shanghai, China
- Bo Zhang
- Zhongshan Hospital Affiliated to Fudan University, Shanghai, China
57
Peng J, Zhang X, Hui W, Lu J, Li Q, Liu S, Shang X. Improving the measurement of semantic similarity by combining gene ontology and co-functional network: a random walk based approach. BMC Syst Biol 2018; 12:18. [PMID: 29560823 PMCID: PMC5861498 DOI: 10.1186/s12918-018-0539-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 02/08/2023]
Abstract
BACKGROUND Gene Ontology (GO) is one of the most popular bioinformatics resources. In the past decade, Gene Ontology-based gene semantic similarity has been effectively used to model gene-to-gene interactions in multiple research areas. However, most existing semantic similarity approaches rely only on GO annotations and structure, or incorporate only local interactions in the co-functional network. This may lead to inaccurate GO-based similarity resulting from the incomplete GO topology structure and gene annotations. RESULTS We present NETSIM2, a new network-based method that allows researchers to measure GO-based gene functional similarities by considering the global structure of the co-functional network with a random walk with restart (RWR)-based method, and by selecting the significant term pairs to decrease noise. Based on the Enzyme Commission (EC) number-based groups of yeast and Arabidopsis, evaluation tests show that NETSIM2 can enhance the accuracy of Gene Ontology-based gene functional similarity. CONCLUSIONS Using NETSIM2 as an example, we found that the accuracy of semantic similarities can be significantly improved by effectively incorporating the global gene-to-gene interactions in the co-functional network, especially for species whose gene annotations in GO are far from complete.
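The random walk with restart (RWR) named in this abstract can be sketched generically. This is a textbook RWR on a toy adjacency matrix, not NETSIM2's implementation; the function name, the restart probability of 0.5, and the three-node example network are illustrative assumptions.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walker that starts at
    `seed` and, at every step, jumps back to `seed` with probability `restart`.

    adj is a square adjacency matrix given as a list of lists of edge weights.
    """
    n = len(adj)
    # Column-normalise so that each column is a transition distribution.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [0.0] * n
    start[seed] = 1.0
    prob = start[:]
    for _ in range(max_iter):
        # p_next = (1 - restart) * W * p + restart * p_start
        nxt = [
            (1 - restart) * sum(trans[i][j] * prob[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, prob)) < tol:
            return nxt
        prob = nxt
    return prob


# Hypothetical three-gene path network: gene0 - gene1 - gene2.
toy_network = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = random_walk_with_restart(toy_network, seed=0)
```

The resulting scores decay with network distance from the seed node, which is how an RWR reflects global network structure and not just direct neighbours.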
Affiliation(s)
- Jiajie Peng
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, China; Centre for Multidisciplinary Convergence Computing (CMCC), School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Xuanshuo Zhang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Weiwei Hui
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Junya Lu
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Qianqian Li
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Shuhui Liu
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, China
58
Hu Y, Zhou M, Shi H, Ju H, Jiang Q, Cheng L. Measuring disease similarity and predicting disease-related ncRNAs by a novel method. BMC Med Genomics 2017; 10:71. [PMID: 29297338 PMCID: PMC5751624 DOI: 10.1186/s12920-017-0315-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Indexed: 11/27/2022] Open
Abstract
Background Similar diseases are always caused by similar molecular origins, such as disease-related protein-coding genes (PCGs), and these molecular associations reflect their similarity. Therefore, current methods for calculating disease similarity often utilize functional interactions of PCGs. However, the existing methods have neglected the fact that genes can also be associated in the gene functional network (GFN) through intermediate nodes. Methods Here we presented a novel method, InfDisSim, to deduce the similarity of diseases. InfDisSim utilized the whole network, based on a random walk with damping, to model the information flow. A benchmark set of similar disease pairs was employed to evaluate the performance of InfDisSim. Results The area under the receiver operating characteristic curve (AUC) was calculated to assess the performance. InfDisSim reaches a high AUC (0.9786), which indicates very good performance. Furthermore, after calculating disease similarity with InfDisSim, we reconfirmed that similar diseases tend to have common therapeutic drugs (Pearson correlation γ2 = 0.1315, p = 2.2e-16). Finally, the disease similarity computed by InfDisSim was employed to construct a miRNA similarity network (MSN) and a lncRNA similarity network (LSN), which were further exploited to predict potential associations of lncRNA-disease pairs and miRNA-disease pairs, respectively. High AUCs (0.9893, 0.9007) based on leave-one-out cross validation show that the LSN and MSN are very appropriate for predicting novel disease-related lncRNAs and miRNAs, respectively. Conclusions The high AUC based on benchmark data indicates that the method performs well. The method is valuable in the prediction of disease-related lncRNAs and miRNAs.
Affiliation(s)
- Yang Hu
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Meng Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150001, China
- Hongbo Shi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150001, China
- Hong Ju
- Department of Information Engineering, Heilongjiang Biological Science and Technology Career Academy, Harbin, 150001, China
- Qinghua Jiang
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150001, China
59
Zhang D, Liu L, Pang L, Jin Q, Ke K, Hu M, Yang J, Ma W, Xie H, Chen X. Biological evaluation and energetic analyses of novel GSK‐3β inhibitors. J Cell Biochem 2017; 119:3510-3518. [DOI: 10.1002/jcb.26522] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/04/2017] [Accepted: 11/10/2017] [Indexed: 11/09/2022]
Affiliation(s)
- Denan Zhang
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Lei Liu
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Lin Pang
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Qing Jin
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Kehui Ke
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Ming Hu
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Jingbo Yang
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Weifang Ma
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Hongbo Xie
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
- Xiujie Chen
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, P. R. China
60
A framework for exploring associations between biomedical terms in PubMed. Oncotarget 2017; 8:103100-103107. [PMID: 29262548 PMCID: PMC5732714 DOI: 10.18632/oncotarget.21532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2017] [Accepted: 09/08/2017] [Indexed: 11/25/2022] Open
Abstract
Co-occurrence relationships between terms in PubMed accelerate the recognition of term associations. The lack of manually curated relationships in vocabularies and the rapid growth of the biomedical literature highlight the importance of co-occurrence relationships. Here we proposed a framework to explore term associations based on a standard procedure that comprises multiple text mining tools and relationship degree calculation methods. The text of PubMed was first segmented into sentences by Apache OpenNLP, and the terms in each sentence were then recognized by MGREP. After that, two terms occurring in a common sentence were identified as a co-occurrence relationship. The relationship degree was then calculated using the Normalized MEDLINE Distance (NMD) or relationship-scaled score (RSS) method. The framework was used to explore associations between terms of Gene Ontology (GO) and Disease Ontology (DO) based on co-occurrence relationships. Results show that pairs of terms with more co-occurrence relationships share more ontology semantic relationships and genes. The associated terms identified from co-occurrence relationships were used to construct a disease association network (DAN). The small giant component is consistent with the observation that diseases in the same class have more linkage than diseases in different classes.
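The sentence-level co-occurrence step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline: naive lowercase substring matching stands in for MGREP, the sentences are assumed to be already segmented, and the distance function is an assumed formula modelled on the Normalized Google Distance, because the abstract does not give the NMD formula. All names and the toy sentences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, vocabulary):
    """Count sentences per term and sentences per pair of co-occurring terms."""
    term_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        text = sentence.lower()
        # Naive substring matching; a real term recogniser would go here.
        found = sorted({term for term in vocabulary if term in text})
        term_counts.update(found)
        # Every unordered pair of terms in one sentence is one co-occurrence.
        pair_counts.update(combinations(found, 2))
    return term_counts, pair_counts


def normalized_distance(fx, fy, fxy, n):
    """Assumed Normalized-Google-Distance-style score; smaller means closer.

    fx, fy: sentence counts of each term; fxy: sentences containing both;
    n: total number of sentences.
    """
    if fxy == 0:
        return math.inf  # the terms never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denominator = math.log(n) - min(lx, ly)
    if denominator == 0:
        return 0.0  # both terms appear in every sentence
    return (max(lx, ly) - lxy) / denominator


# Hypothetical sentences and vocabulary, for illustration only.
toy_sentences = [
    "Apoptosis is altered in breast cancer.",
    "Breast cancer cells evade apoptosis.",
    "Apoptosis regulates development.",
    "Diabetes is common.",
]
toy_vocabulary = {"apoptosis", "breast cancer", "diabetes"}
terms, pairs = cooccurrence_counts(toy_sentences, toy_vocabulary)
pair = ("apoptosis", "breast cancer")
distance = normalized_distance(
    terms["apoptosis"], terms["breast cancer"], pairs[pair], len(toy_sentences)
)
```

Counting at the sentence level, as the abstract describes, is stricter than counting at the abstract level, so fewer but more specific term pairs are retained.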
61
Xie H, Zeng D, Chen X, Huo D, Liu L, Zhang D, Jin Q, Ke K, Hu M. Prediction on the risk population of idiosyncratic adverse reactions based on molecular docking with mutant proteins. Oncotarget 2017; 8:95568-95576. [PMID: 29221149 PMCID: PMC5707043 DOI: 10.18632/oncotarget.21509] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/06/2017] [Accepted: 09/20/2017] [Indexed: 01/11/2023] Open
Abstract
Idiosyncratic adverse drug reactions are drug reactions that occur rarely and unpredictably among the population. These reactions often occur after a drug is marketed, which means that they are strongly related to the genotype of the population. The prediction of such adverse reactions is a major challenge because of the lack of appropriate test models during the drug development process. In this study, we chose withdrawn drugs because the reasons why they were withdrawn, and the countries or regions from which they were withdrawn, are easily obtained. We selected Dilevalol and its chiral drug (Labetalol) as the investigatory drugs, as they have been withdrawn from a European market (Britain) because of serious hepatotoxicity. First, we searched for and obtained the Dilevalol-induced liver-injury-related protein, multidrug resistance protein 1 (MDR1), from the Comparative Toxicogenomics Database (CTD). Then, we searched for and extracted 477 non-synonymous single nucleotide polymorphisms (nsSNPs) on MDR1 in the dbSNP database. Second, we used the VarMod tool to predict the functional changes of MDR1 induced by these nsSNPs, from which we extracted the nsSNPs that significantly change the functions of this protein. Third, we built the three-dimensional structures of those variant proteins and used AutoDock to perform a docking study, choosing the best model to determine the sites of nsSNPs. Finally, we used the data from the 1000 Genomes Project to verify the dominant population distribution of the risk SNP. We applied the same strategy to the post-marketing drug-induced liver injury drugs to further test the feasibility of our method.
Affiliation(s)
- Hongbo Xie
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
- Diheng Zeng
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
- Xiujie Chen
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
- Diwei Huo
- The 2nd Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang 150081, PR China
- Lei Liu
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
- Denan Zhang
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
- Qing Jin
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
- Kehui Ke
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
- Ming Hu
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
62
Hu Y, Zhao L, Liu Z, Ju H, Shi H, Xu P, Wang Y, Cheng L. DisSetSim: an online system for calculating similarity between disease sets. J Biomed Semantics 2017; 8:28. [PMID: 29297411 PMCID: PMC5763469 DOI: 10.1186/s13326-017-0140-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/21/2022] Open
Abstract
Background Functional similarity between molecules results in similar phenotypes, such as diseases. Therefore, revealing the function of molecules based on the diseases they induce is an effective approach. However, the lack of a tool for obtaining the similarity score of pair-wise disease sets (SSDS) limits this type of application. Results Here, we introduce DisSetSim, an online system that solves this problem. Five state-of-the-art methods (Resnik’s, Lin’s, Wang’s, PSB, and SemFunSim) were first implemented to measure the similarity score of pair-wise diseases (SSD). Then the “pair-wise-best pairs-average” (PWBPA) method was implemented to calculate the SSDS from the SSD. The system was applied to calculating the functional similarity of miRNAs based on their induced disease sets. The results were further used to predict potential disease-miRNA relationships. Conclusions The high area under the receiver operating characteristic curve (AUC, 0.9296) based on leave-one-out cross validation shows that the PWBPA method achieves a high true positive rate and a low false positive rate. The system can be accessed from http://www.bio-annotation.cn:8080/DisSetSim/.
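The step from pair-wise disease similarity (SSD) to a set-level score (SSDS) can be sketched with a best-match average. The abstract does not define PWBPA precisely, so this assumes it corresponds to the common best-match-average scheme: each disease is matched to its most similar counterpart in the other set and the best scores are averaged. The function name, disease identifiers, and similarity values are hypothetical.

```python
def best_match_average(set_a, set_b, sim):
    """Set-level similarity from a pair-wise similarity function `sim`.

    Assumption: this best-match average stands in for the PWBPA method.
    """
    if not set_a or not set_b:
        return 0.0
    # Best partner in set_b for every disease in set_a, and vice versa.
    best_for_a = [max(sim(a, b) for b in set_b) for a in set_a]
    best_for_b = [max(sim(a, b) for a in set_a) for b in set_b]
    return (sum(best_for_a) + sum(best_for_b)) / (len(set_a) + len(set_b))


# Hypothetical pair-wise disease similarities, for illustration only.
toy_scores = {
    frozenset(["d1", "d2"]): 0.5,
    frozenset(["d1", "d3"]): 0.2,
    frozenset(["d2", "d3"]): 0.4,
}


def toy_similarity(x, y):
    """Symmetric lookup; identical diseases score 1.0."""
    return 1.0 if x == y else toy_scores[frozenset([x, y])]


set_score = best_match_average(["d1", "d2"], ["d2", "d3"], toy_similarity)
```

Any of the five pair-wise measures listed in the abstract could be supplied as `sim`, since the aggregation only requires a symmetric score for each disease pair.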
Affiliation(s)
- Yang Hu
- Harbin Institute of Technology, School of Life Science and Technology, Harbin, 150001, People's Republic of China
- Lingling Zhao
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Zhiyan Liu
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Hong Ju
- Department of Information Engineering, Heilongjiang Biological Science and Technology Career Academy, Harbin, 150001, People's Republic of China
- Hongbo Shi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150001, People's Republic of China
- Peigang Xu
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Yadong Wang
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150001, People's Republic of China
63
Peng J, Li Q, Shang X. Investigations on factors influencing HPO-based semantic similarity calculation. J Biomed Semantics 2017; 8:34. [PMID: 29297376 PMCID: PMC5763495 DOI: 10.1186/s13326-017-0144-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022] Open
Abstract
Background Although disease diagnosis has greatly benefited from next generation sequencing technologies, it is still difficult to make the right diagnosis purely based on sequencing technologies for many diseases with complex phenotypes and high genetic heterogeneity. Recently, calculating Human Phenotype Ontology (HPO)-based phenotype semantic similarity has contributed substantially to disease diagnosis. However, the factors that affect the accuracy of HPO-based semantic similarity have not been evaluated systematically. Results In this study, we proposed a new framework called HPOFactor to evaluate these factors. Our model includes four components: (1) the size of the annotation set, (2) the evidence code of annotations, (3) the quality of annotations, and (4) the coverage of annotations. Conclusions HPOFactor analyzes the four factors systematically based on two kinds of experiments: causative gene prediction and disease prediction. Furthermore, semantic similarity measures could be designed based on the characteristics of these factors.
Affiliation(s)
- Jiajie Peng
- Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China
- Qianqian Li
- Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China
- Xuequn Shang
- Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China
64
Han Y, Sun W, Sun G, Hou X, Gong Z, Xu J, Bai X, Fu L. A 3-year observation of testosterone deficiency in Chinese patients with chronic heart failure. Oncotarget 2017; 8:79835-79842. [PMID: 29108365 PMCID: PMC5668098 DOI: 10.18632/oncotarget.19816] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/26/2017] [Accepted: 07/12/2017] [Indexed: 12/11/2022] Open
Abstract
Testosterone deficiency is present in a certain proportion of men with chronic heart failure (CHF). Low testosterone levels in American and European patients with CHF lead to high mortality and readmission rates. Interestingly, this relationship has not been studied in Chinese patients. To this end, 167 Chinese men with CHF underwent clinical and laboratory evaluations associated with determinations of testosterone levels. Total testosterone (TT) levels and sex hormone-binding globulin were measured by chemiluminescence or immunoassays, and free testosterone (FT) levels were calculated. Based upon results from these assays, patients were divided into either a low testosterone (LT; n = 93) or normal testosterone (NT; n = 74) group. Subsequently, records from each patient were reviewed over a follow-up duration of at least 3 years. Patients in the LT group experienced worse cardiac function and a higher prevalence of etiology (ischemic vs. non-ischemic) and comorbidity (both P < 0.05). In addition, readmission rates of patients in the LT group were higher than those of patients in the NT group (3.32 ± 1.66 vs. 1.57 ± 0.89). Overall, deficiencies in FT levels were accompanied by increased mortality (HR = 6.301, 95% CI 3.187–12.459, P < .0001).
Affiliation(s)
- Ying Han
- Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Weiju Sun
- Cardiovascular Department, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Guizhi Sun
- Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Xiaolu Hou
- Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Zhaowei Gong
- Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Jing Xu
- Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Xiuping Bai
- Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Lu Fu
- Cardiovascular Department, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China