Reference Citation Analysis

For: Gong X, Jiang J, Duan Z, Lu H. A new method to measure the semantic similarity from query phenotypic abnormalities to diseases based on the human phenotype ontology. BMC Bioinformatics 2018;19:162. [PMID: 29745853 PMCID: PMC5998886 DOI: 10.1186/s12859-018-2064-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]

Cited by Other Article(s)

Huang D, Jiang J, Zhao T, Wu S, Li P, Lyu Y, Feng J, Wei M, Zhu Z, Gu J, Ren Y, Yu G, Lu H. diseaseGPS: auxiliary diagnostic system for genetic disorders based on genotype and phenotype. Bioinformatics 2023;39:btad517. [PMID: 37647638 PMCID: PMC10500091 DOI: 10.1093/bioinformatics/btad517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2023] [Revised: 07/24/2023] [Accepted: 08/29/2023] [Indexed: 09/01/2023]

Affiliation(s)

Daoyi Huang: State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China
Jianping Jiang: State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China; Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Tingting Zhao: Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China; Shanghai Engineering Research Center for Big Data in Pediatric Precision Medicine, Shanghai, China
Shengnan Wu: Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Pin Li: Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Yongfen Lyu: Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Jincai Feng: Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Mingyue Wei: Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Zhixing Zhu: Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China; Shanghai Engineering Research Center for Big Data in Pediatric Precision Medicine, Shanghai, China
Jianlei Gu: State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China
Yongyong Ren: State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China
Guangjun Yu: Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China; Shanghai Engineering Research Center for Big Data in Pediatric Precision Medicine, Shanghai, China; School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong, China
Hui Lu: State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China; Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China

Long Y, Luo J, Zhang Y, Xia Y. Predicting human microbe-disease associations via graph attention networks with inductive matrix completion. Brief Bioinform 2020;22:5876591. [PMID: 32725163 DOI: 10.1093/bib/bbaa146] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 04/23/2020] [Revised: 06/07/2020] [Accepted: 06/11/2020] [Indexed: 12/13/2022]

Abstract

MOTIVATION

Human microbes play a critical role in an extensive range of complex human diseases and have become a new target in precision medicine. In silico methods of identifying microbe-disease associations not only can provide deep insight into the pathogenic mechanisms of complex human diseases but can also help pharmacologists screen candidate targets for drug development. However, the majority of existing approaches are based on linear models or label propagation, which suffer from limitations in capturing nonlinear associations between microbes and diseases. Besides, it is still a great challenge for most previous methods to make predictions for new diseases (or new microbes) with few or no observed associations.

RESULTS

In this work, we construct features for microbes and diseases by fully exploiting multiple sources of biomedical data, and then propose a novel deep learning framework of graph attention networks with inductive matrix completion for human microbe-disease association prediction, named GATMDA. To our knowledge, this is the first attempt to leverage graph attention networks for this important task. In particular, we develop an optimized graph attention network with talking-heads to learn representations for nodes (i.e. microbes and diseases). To focus on more important neighbours and filter out noise, we further design a bi-interaction aggregator to enforce representation aggregation of similar neighbours. In addition, we combine inductive matrix completion to reconstruct microbe-disease associations to capture the complicated associations between diseases and microbes. Comprehensive experiments on two data sets (i.e. HMDAD and Disbiome) demonstrated that our proposed model consistently outperformed baseline methods. Case studies on two diseases, i.e. asthma and inflammatory bowel disease, further confirmed the effectiveness of our proposed model, GATMDA.
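The attention step mentioned in this abstract can be illustrated with a minimal, standard-library sketch of single-head graph attention over one node's neighbours. This is a generic GAT-style aggregation, not the GATMDA implementation: it omits the learned linear transform, the talking-heads variant, the bi-interaction aggregator and inductive matrix completion, and the node names, feature vectors and attention vector are all made up for illustration.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU as commonly used for graph attention scores."""
    return x if x > 0 else slope * x

def attention_aggregate(node, neighbours, features, attn_vec):
    """Return (attention weights, weighted sum of neighbour features).

    Score for neighbour j is LeakyReLU(a . [h_i || h_j]); the scores are
    softmax-normalised over the neighbourhood. The learned projection W of
    a full GAT layer is left out to keep the sketch short.
    """
    h_i = features[node]
    scores = []
    for j in neighbours:
        concat = h_i + features[j]  # list concatenation: [h_i || h_j]
        scores.append(leaky_relu(sum(a * x for a, x in zip(attn_vec, concat))))
    top = max(scores)  # shift by the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    aggregated = [
        sum(w * features[j][d] for w, j in zip(weights, neighbours))
        for d in range(len(h_i))
    ]
    return weights, aggregated

# Hypothetical toy data: one microbe node with two disease neighbours.
features = {
    "microbe_a": [1.0, 0.0],
    "disease_x": [0.9, 0.1],
    "disease_y": [0.0, 1.0],
}
attn_vec = [0.5, -0.5, 1.0, -1.0]  # assumed attention parameters
weights, h_new = attention_aggregate(
    "microbe_a", ["disease_x", "disease_y"], features, attn_vec
)
```

With these made-up parameters the neighbour whose features resemble the microbe's receives the larger weight, which is the "focus on more important neighbours" behaviour the abstract describes in general terms.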

AVAILABILITY

Python code and data sets are available at https://github.com/yahuilong/GATMDA.

CONTACT

luojiawei@hnu.edu.cn.

Cardoso C, Sousa RT, Köhler S, Pesquita C. A Collection of Benchmark Data Sets for Knowledge Graph-based Similarity in the Biomedical Domain. Database (Oxford) 2020;2020:baaa078. [PMID: 33181823 PMCID: PMC7661097 DOI: 10.1093/database/baaa078] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/22/2020] [Revised: 08/13/2020] [Accepted: 08/24/2020] [Indexed: 01/12/2023]

Wang W, Langlois R, Langlois M, Genchev GZ, Wang X, Lu H. Functional Site Discovery From Incomplete Training Data: A Case Study With Nucleic Acid-Binding Proteins. Front Genet 2019;10:729. [PMID: 31543893 PMCID: PMC6729729 DOI: 10.3389/fgene.2019.00729] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/21/2018] [Accepted: 07/11/2019] [Indexed: 12/27/2022]

Shen F, Peng S, Fan Y, Wen A, Liu S, Wang Y, Wang L, Liu H. HPO2Vec+: Leveraging heterogeneous knowledge resources to enrich node embeddings for the Human Phenotype Ontology. J Biomed Inform 2019;96:103246. [PMID: 31255713 DOI: 10.1016/j.jbi.2019.103246] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/01/2019] [Revised: 06/25/2019] [Accepted: 06/26/2019] [Indexed: 11/25/2022]

Abstract

BACKGROUND

In precision medicine, deep phenotyping is defined as the precise and comprehensive analysis of phenotypic abnormalities, aiming to acquire a better understanding of the natural history of a disease and its genotype-phenotype associations. Detecting phenotypic relevance is an important task when translating precision medicine into clinical practice, especially for patient stratification tasks based on deep phenotyping. In our previous work, we developed node embeddings for the Human Phenotype Ontology (HPO) to assist in phenotypic relevance measurement incorporating distributed semantic representations. However, the derived HPO embeddings hold only distributed representations for IS-A relationships among nodes, hampering the ability to fully explore the graph.

METHODS

In this study, we developed a framework, HPO2Vec+, to enrich the produced HPO embeddings with heterogeneous knowledge resources (i.e., DECIPHER, OMIM, and Orphanet) for detecting phenotypic relevance. Specifically, we parsed disease-phenotype associations contained in these three resources to enrich non-inheritance relationships among phenotypic nodes in the HPO. To generate node embeddings for the HPO, node2vec was applied to perform random-walk-based node sampling on the enriched HPO graphs, followed by feature learning over the sampled nodes to generate enriched node embeddings. Four HPO embeddings were generated based on different graph structures, which we hereafter label as HPOEmb-Original, HPOEmb-DECIPHER, HPOEmb-OMIM, and HPOEmb-Orphanet. We evaluated the derived embeddings quantitatively through an HPO link prediction task with four edge embedding operations and six machine learning algorithms. The resulting best embeddings were then evaluated for patient stratification of 10 rare diseases using electronic health records (EHR) collected at Mayo Clinic. We assessed our framework qualitatively by visualizing phenotypic clusters and conducting a use case study on primary hyperoxaluria (PH), a rare disease, on the task of inferring relevant phenotypes given 22 annotated PH-related phenotypes.
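The pipeline described above (enrich the graph, sample nodes by random walk, build edge features for link prediction) can be sketched roughly as follows. Several things here are assumptions rather than details from the abstract: the enrichment rule (linking phenotypes annotated to the same disease), the unbiased walk (the node2vec special case in which the return and in-out parameters both equal 1), and the four edge operators, which are taken from the original node2vec paper. Term and disease identifiers are placeholders, not real HPO or Orphanet IDs.

```python
import random
from itertools import combinations

def enrich(graph, disease_to_phenotypes):
    """Add undirected edges between phenotypes annotated to the same disease.

    Assumed enrichment rule; the abstract does not state how edges are built.
    """
    enriched = {node: set(nbrs) for node, nbrs in graph.items()}
    for phenotypes in disease_to_phenotypes.values():
        for a, b in combinations(sorted(phenotypes), 2):
            enriched.setdefault(a, set()).add(b)
            enriched.setdefault(b, set()).add(a)
    return {node: sorted(nbrs) for node, nbrs in enriched.items()}

def random_walk(graph, start, length, rng):
    """Unbiased random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        nbrs = graph.get(walk[-1], [])
        if not nbrs:
            break
        walk.append(rng.choice(nbrs))
    return walk

def edge_features(u, v):
    """Combine two node embeddings into edge embeddings (node2vec operators)."""
    return {
        "average": [(a + b) / 2 for a, b in zip(u, v)],
        "hadamard": [a * b for a, b in zip(u, v)],
        "weighted_l1": [abs(a - b) for a, b in zip(u, v)],
        "weighted_l2": [(a - b) ** 2 for a, b in zip(u, v)],
    }

# Placeholder IS-A graph (stored as undirected adjacency) and one annotation.
is_a = {"HP:root": ["HP:A", "HP:B"], "HP:A": ["HP:root"], "HP:B": ["HP:root"]}
annotations = {"DISEASE:1": ["HP:A", "HP:B"]}

enriched = enrich(is_a, annotations)
walk = random_walk(enriched, "HP:A", 5, random.Random(0))
feats = edge_features([1.0, 2.0], [3.0, 0.0])
```

In a full pipeline the walks would be fed to a skip-gram model to learn the node embeddings, and the edge features would be passed to a classifier for link prediction; both steps are left out here.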

RESULTS

The quantitative link prediction task shows that HPOEmb-Orphanet achieved an optimal AUROC of 0.92 and an average precision of 0.94. In addition, HPOEmb-Orphanet achieved an optimal F1 score of 0.86. The quantitative patient similarity measurement task indicates that HPOEmb-Orphanet achieved the highest average detection rate for similar patients over 10 rare diseases and performed better than other similarity measures implemented by an existing tool, HPOSim, especially for patient pairs with fewer shared phenotypes. The qualitative evaluation shows that the enriched HPO embeddings are generally able to detect relationships among nodes with fine granularity and HPOEmb-Orphanet is particularly good at associating phenotypes across different disease systems. For the use case of detecting relevant phenotypic characterizations for given PH-related phenotypes, HPOEmb-Orphanet outperformed the other three HPO embeddings by achieving the highest average P@5 of 0.81 and the highest P@10 of 0.79. Compared to seven conventional similarity measurements provided by HPOSim, HPOEmb-Orphanet is able to detect more relevant phenotypic pairs, especially for pairs not in inheritance relationships.
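For readers unfamiliar with the P@5 and P@10 figures reported here, precision at k is the fraction of the top k ranked items that are relevant, and an "average P@k" is the mean of that value over several queries. The sketch below uses invented phenotype labels and relevance judgments purely for illustration; it is not the evaluation code behind the reported numbers.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k

def mean_precision_at_k(rankings, relevant_sets, k):
    """Average P@k over several queries (one ranking and one set per query)."""
    values = [precision_at_k(r, rel, k) for r, rel in zip(rankings, relevant_sets)]
    return sum(values) / len(values)

# Hypothetical ranking of candidate phenotypes for one query phenotype.
ranked = ["p1", "p2", "p3", "p4", "p5"]
relevant = {"p1", "p3", "p4", "p9"}

p_at_5 = precision_at_k(ranked, relevant, 5)  # 3 of the top 5 are relevant
p_at_2 = precision_at_k(ranked, relevant, 2)  # 1 of the top 2 is relevant
avg = mean_precision_at_k([ranked, ranked], [relevant, {"p2"}], 5)
```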

CONCLUSION

We drew the following conclusions based on the evaluation results. First, with additional non-inheritance edges, enriched HPO embeddings can detect more associations between fine-granularity phenotypic nodes regardless of their topological structures in the HPO graph. Second, HPOEmb-Orphanet not only can achieve the optimal performance through link prediction and patient stratification based on phenotypic similarity, but is also able to detect relevant phenotypes closer to domain experts' judgments than other embeddings and conventional similarity measurements. Third, incorporating heterogeneous knowledge resources does not necessarily result in better performance for detecting relevant phenotypes. From a clinical perspective, in our use case study, clinically oriented knowledge resources (e.g., Orphanet) can achieve better performance in detecting relevant phenotypic characterizations compared to biomedically oriented knowledge resources (e.g., DECIPHER and OMIM).

Jiang J, Gu J, Zhao T, Lu H. VCF-Server: A web-based visualization tool for high-throughput variant data mining and management. Mol Genet Genomic Med 2019;7:e00641. [PMID: 31127704 PMCID: PMC6625089 DOI: 10.1002/mgg3.641] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/03/2019] [Revised: 01/25/2019] [Accepted: 02/20/2019] [Indexed: 12/16/2022]

Abstract

BACKGROUND

Next-generation sequencing (NGS) has been widely used in both clinics and research. It has become the most powerful tool for diagnosing genetic disorders and investigating disease etiology through the discovery of genetic variants. Variants identified by NGS are stored in variant call format (VCF) files. However, querying and filtering VCF files are extremely difficult for researchers without programming skills. Furthermore, as the mutation data are increasing exponentially, there is an urgent need to develop tools to manage these variant data in a centralized way.

METHODS

The VCF-Server was developed as a web-based visualization tool to support the interactive analysis of genetic variant data. It allows researchers and medical geneticists to manage, annotate, filter, query, and export variants in a fast and effective way.

RESULTS

In this study, we developed the VCF-Server, a powerful and easily accessible tool for researchers and medical geneticists to perform variant analysis. Users can query VCFs, annotate, and filter variants without knowing programming code. Once the VCF file is uploaded, VCF-Server allows users to annotate the VCF with commonly used databases or user-defined variant annotations (including variant blacklist and whitelist). Variant information in the VCF is shown visually via the interactive graphical interface. Users can filter the variants with flexible filtering rules, and the prioritized variants can be exported locally for further analysis. As VCF-Server adopts a web file system, files in the VCF-Server can be stored and managed in a centralized way. Moreover, VCF-Server allows direct web-based analysis (accessible through either desktop computers or mobile devices) as well as local deployment.
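To give a sense of the kind of rule-based filtering the abstract describes, here is a small standard-library sketch that parses the eight fixed VCF columns and applies a quality threshold together with a variant whitelist and blacklist. It is not VCF-Server code: the function names, the threshold, and the whitelist/blacklist semantics (whitelisted variants are always kept, blacklisted ones always dropped) are assumptions made for illustration.

```python
def parse_vcf_line(line):
    """Parse the eight fixed, tab-separated VCF columns of one data line."""
    chrom, pos, vid, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
    info_dict = {}
    for field in info.split(";"):
        if not field or field == ".":
            continue
        key, _, value = field.partition("=")
        info_dict[key] = value if value else True  # flag fields carry no value
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,
        "ref": ref,
        "alt": alt.split(","),
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": info_dict,
    }

def filter_variants(lines, min_qual=30.0, blacklist=frozenset(), whitelist=frozenset()):
    """Keep variants that pass the rules; the rule semantics are assumed."""
    kept = []
    for line in lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        v = parse_vcf_line(line)
        key = (v["chrom"], v["pos"], v["ref"], ",".join(v["alt"]))
        if key in whitelist:
            kept.append(v)
            continue
        if key in blacklist:
            continue
        if v["filter"] in ("PASS", ".") and v["qual"] is not None and v["qual"] >= min_qual:
            kept.append(v)
    return kept

# Made-up example records.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\tPASS\tDP=20",
    "1\t2000\t.\tC\tT\t10\tPASS\tDP=5",
    "2\t3000\t.\tG\tA\t99\tPASS\tDP=40;DB",
]
kept = filter_variants(
    vcf_lines,
    min_qual=30.0,
    blacklist={("2", 3000, "G", "A")},
    whitelist={("1", 2000, "C", "T")},
)
kept_positions = [v["pos"] for v in kept]
```

A production tool would also handle genotype columns and header-declared INFO types; dedicated VCF libraries are a better fit for that than hand-written parsing.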

CONCLUSIONS

With an easy-to-use graphical interface, VCF-Server allows researchers with little bioinformatics background to explore and mine mutation data, which may broaden the application of NGS technology in clinics and research. The tool is freely available for use at https://www.diseasegps.org/VCF-Server?lan=eng.
