Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Xing W, Qi J, Yuan X, Li L, Zhang X, Fu Y, Xiong S, Hu L, Peng J. A gene-phenotype relationship extraction pipeline from the biomedical literature using a representation learning approach. Bioinformatics 2018;34:i386-i394. [PMID: 29950017 PMCID: PMC6022650 DOI: 10.1093/bioinformatics/bty263] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 01/01/2023] Open

Cited by Other Article(s)

Gao Y, Zhou Q, Luo J, Xia C, Zhang Y, Yue Z. Crop-GPA: an integrated platform of crop gene-phenotype associations. NPJ Syst Biol Appl 2024;10:15. [PMID: 38346982 PMCID: PMC10861494 DOI: 10.1038/s41540-024-00343-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 01/22/2024] [Indexed: 02/15/2024] Open

Collins C, Baker S, Brown J, Zheng H, Chan A, Stenius U, Narita M, Korhonen A. Text mining for contexts and relationships in cancer genomics literature. Bioinformatics 2024;40:btae021. [PMID: 38258418 PMCID: PMC10822582 DOI: 10.1093/bioinformatics/btae021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 09/27/2023] [Accepted: 01/15/2024] [Indexed: 01/24/2024] Open

Zeibich R, Kwan P, O’Brien TJ, Perucca P, Ge Z, Anderson A. Applications for Deep Learning in Epilepsy Genetic Research. Int J Mol Sci 2023;24:14645. [PMID: 37834093 PMCID: PMC10572791 DOI: 10.3390/ijms241914645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 09/11/2023] [Accepted: 09/21/2023] [Indexed: 10/15/2023] Open

Affiliation(s)

Robert Zeibich Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia
Patrick Kwan Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia; Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia; Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
Terence J. O’Brien Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia; Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia; Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia
Piero Perucca Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; Department of Neurology, Alfred Health, Melbourne, VIC 3004, Australia; Department of Neurology, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia; Epilepsy Research Centre, Department of Medicine, Austin Health, The University of Melbourne, Melbourne, VIC 3084, Australia; Bladin-Berkovic Comprehensive Epilepsy Program, Department of Neurology, Austin Health, The University of Melbourne, Melbourne, VIC 3084, Australia
Zongyuan Ge Faculty of Engineering, Monash University, Melbourne, VIC 3800, Australia; Monash-Airdoc Research, Monash University, Melbourne, VIC 3800, Australia
Alison Anderson Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC 3800, Australia; Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, VIC 3052, Australia

Li X, Yuan H, Wu X, Wang C, Wu M, Shi H, Lv Y. MultiDS-MDA: Integrating multiple data sources into heterogeneous network for predicting novel metabolite-drug associations. Comput Biol Med 2023;162:107067. [PMID: 37276756 DOI: 10.1016/j.compbiomed.2023.107067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/15/2023] [Accepted: 05/27/2023] [Indexed: 06/07/2023]

Jadhav A, Kumar T, Raghavendra M, Loganathan T, Narayanan M. Predicting cross-tissue hormone-gene relations using balanced word embeddings. Bioinformatics 2022;38:4771-4781. [PMID: 36000859 PMCID: PMC9563690 DOI: 10.1093/bioinformatics/btac578] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/23/2022] [Revised: 07/29/2022] [Accepted: 08/23/2022] [Indexed: 11/12/2022] Open

Feng B, Gao J. AnthraxKP: a knowledge graph-based, Anthrax Knowledge Portal mined from biomedical literature. Database (Oxford) 2022;2022:6598946. [PMID: 35653350 PMCID: PMC9216567 DOI: 10.1093/database/baac037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Revised: 04/13/2022] [Accepted: 05/13/2022] [Indexed: 11/15/2022]

Krämer A, Green J, Billaud JN, Pasare NA, Jones M, Tugendreich S. Mining hidden knowledge: embedding models of cause-effect relationships curated from the biomedical literature. Bioinformatics Advances 2022;2:vbac022. [PMID: 36699407 PMCID: PMC9710590 DOI: 10.1093/bioadv/vbac022] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/05/2022] [Revised: 03/04/2022] [Accepted: 04/06/2022] [Indexed: 01/28/2023]

Artificial Intelligence and Cardiovascular Genetics. Life (Basel) 2022;12:life12020279. [PMID: 35207566 PMCID: PMC8875522 DOI: 10.3390/life12020279] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/30/2021] [Revised: 01/26/2022] [Accepted: 02/09/2022] [Indexed: 12/13/2022] Open

Birgmeier J, Haeussler M, Deisseroth CA, Steinberg EH, Jagadeesh KA, Ratner AJ, Guturu H, Wenger AM, Diekhans ME, Stenson PD, Cooper DN, Ré C, Beggs AH, Bernstein JA, Bejerano G. AMELIE speeds Mendelian diagnosis by matching patient phenotype and genotype to primary literature. Sci Transl Med 2021;12:eaau9113. [PMID: 32434849 DOI: 10.1126/scitranslmed.aau9113] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 07/30/2018] [Revised: 08/14/2019] [Accepted: 04/22/2020] [Indexed: 12/21/2022]

Zhao S, Su C, Lu Z, Wang F. Recent advances in biomedical literature mining. Brief Bioinform 2021;22:bbaa057. [PMID: 32422651 PMCID: PMC8138828 DOI: 10.1093/bib/bbaa057] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 12/11/2019] [Revised: 03/22/2020] [Accepted: 03/25/2020] [Indexed: 01/26/2023] Open

Identifying protein subcellular localisation in scientific literature using bidirectional deep recurrent neural network. Sci Rep 2021;11:1696. [PMID: 33462256 PMCID: PMC7813825 DOI: 10.1038/s41598-020-80441-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2020] [Accepted: 12/17/2020] [Indexed: 11/17/2022] Open

Fu Y, Xu J, Tang Z, Wang L, Yin D, Fan Y, Zhang D, Deng F, Zhang Y, Zhang H, Wang H, Xing W, Yin L, Zhu S, Zhu M, Yu M, Li X, Liu X, Yuan X, Zhao S. A gene prioritization method based on a swine multi-omics knowledgebase and a deep learning model. Commun Biol 2020;3:502. [PMID: 32913254 PMCID: PMC7483748 DOI: 10.1038/s42003-020-01233-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/18/2020] [Accepted: 08/07/2020] [Indexed: 12/27/2022] Open

Affiliation(s)

Yuhua Fu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China; School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Jingya Xu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Zhenshuang Tang Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Lu Wang Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Dong Yin Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Yu Fan Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Dongdong Zhang School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Fei Deng School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Yanping Zhang School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Haohao Zhang School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Haiyan Wang Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Wenhui Xing School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Lilin Yin Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Shilin Zhu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Mengjin Zhu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Mei Yu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Xinyun Li Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Xiaolei Liu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Xiaohui Yuan School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Shuhong Zhao Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China

Perera N, Dehmer M, Emmert-Streib F. Named Entity Recognition and Relation Detection for Biomedical Information Extraction. Front Cell Dev Biol 2020;8:673. [PMID: 32984300 PMCID: PMC7485218 DOI: 10.3389/fcell.2020.00673] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 12/12/2019] [Accepted: 07/02/2020] [Indexed: 12/29/2022] Open

Ju M, Short AD, Thompson P, Bakerly ND, Gkoutos GV, Tsaprouni L, Ananiadou S. Annotating and detecting phenotypic information for chronic obstructive pulmonary disease. JAMIA Open 2020;2:261-271. [PMID: 31984360 PMCID: PMC6951876 DOI: 10.1093/jamiaopen/ooz009] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/19/2018] [Revised: 02/21/2019] [Accepted: 03/19/2019] [Indexed: 12/29/2022] Open

Abstract

Objectives

Chronic obstructive pulmonary disease (COPD) phenotypes cover a range of lung abnormalities. To allow text mining methods to identify pertinent and potentially complex information about these phenotypes from textual data, we have developed a novel annotated corpus, which we use to train a neural network-based named entity recognizer to detect fine-grained COPD phenotypic information.

Materials and methods

Since COPD phenotype descriptions often mention other concepts within them (proteins, treatments, etc.), our corpus annotations include both outermost phenotype descriptions and concepts nested within them. Our neural layered bidirectional long short-term memory conditional random field (BiLSTM-CRF) network firstly recognizes nested mentions, which are fed into subsequent BiLSTM-CRF layers, to help to recognize enclosing phenotype mentions.

Results

Our corpus of 30 full papers (available at: http://www.nactem.ac.uk/COPD) is annotated by experts with 27 030 phenotype-related concept mentions, most of which are automatically linked to UMLS Metathesaurus concepts. When trained using the corpus, our BiLSTM-CRF network outperforms other popular approaches in recognizing detailed phenotypic information.

Discussion

Information extracted by our method can facilitate efficient location and exploration of detailed information about phenotypes, for example, those specifically concerning reactions to treatments.

Conclusion

The importance of our corpus for developing methods to extract fine-grained information about COPD phenotypes is demonstrated through its successful use to train a layered BiLSTM-CRF network to extract phenotypic information at various levels of granularity. The minimal human intervention needed for training should permit ready adaption to extracting phenotypic information about other diseases.

Braun IR, Lawrence-Dill CJ. Automated Methods Enable Direct Computation on Phenotypic Descriptions for Novel Candidate Gene Prediction. Frontiers in Plant Science 2020;10:1629. [PMID: 31998331 PMCID: PMC6965352 DOI: 10.3389/fpls.2019.01629] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/05/2019] [Accepted: 11/19/2019] [Indexed: 06/01/2023]

Xu B, Liu Y, Yu S, Wang L, Dong J, Lin H, Yang Z, Wang J, Xia F. A network embedding model for pathogenic genes prediction by multi-path random walking on heterogeneous network. BMC Med Genomics 2019;12:188. [PMID: 31865919 PMCID: PMC6927107 DOI: 10.1186/s12920-019-0627-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/09/2023] Open

Heo GE, Xie Q, Song M, Lee JH. Combining entity co-occurrence with specialized word embeddings to measure entity relation in Alzheimer's disease. BMC Med Inform Decis Mak 2019;19:240. [PMID: 31801521 PMCID: PMC6894106 DOI: 10.1186/s12911-019-0934-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022] Open

Tsueng G, Nanis M, Fouquier JT, Mayers M, Good BM, Su AI. Applying citizen science to gene, drug and disease relationship extraction from biomedical abstracts. Bioinformatics 2019;36:1226-1233. [PMID: 31504205 PMCID: PMC8104067 DOI: 10.1093/bioinformatics/btz678] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/17/2019] [Revised: 08/05/2019] [Accepted: 08/29/2019] [Indexed: 01/31/2023] Open

Abstract

MOTIVATION

Biomedical literature is growing at a rate that outpaces our ability to harness the knowledge contained therein. To mine valuable inferences from the large volume of literature, many researchers use information extraction algorithms to harvest information in biomedical texts. Information extraction is usually accomplished via a combination of manual expert curation and computational methods. Advances in computational methods usually depend on the time-consuming generation of gold standards by a limited number of expert curators. Citizen science is public participation in scientific research. We previously found that citizen scientists are willing and capable of performing named entity recognition of disease mentions in biomedical abstracts, but did not know if this was true with relationship extraction (RE).

RESULTS

In this article, we introduce the Relationship Extraction Module of the web-based application Mark2Cure (M2C) and demonstrate that citizen scientists can perform RE. We confirm the importance of accurate named entity recognition on user performance of RE and identify design issues that impacted data quality. We find that the data generated by citizen scientists can be used to identify relationship types not currently available in the M2C Relationship Extraction Module. We compare the citizen science-generated data with algorithm-mined data and identify ways in which the two approaches may complement one another. We also discuss opportunities for future improvement of this system, as well as the potential synergies between citizen science, manual biocuration and natural language processing.

AVAILABILITY AND IMPLEMENTATION

Mark2Cure platform: https://mark2cure.org; Mark2Cure source code: https://github.com/sulab/mark2cure; and data and analysis code for this article: https://github.com/gtsueng/M2C_rel_nb.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Kafkas Ş, Hoehndorf R. Ontology based text mining of gene-phenotype associations: application to candidate gene prediction. Database (Oxford) 2019;2019:baz019. [PMID: 30809638 PMCID: PMC6391585 DOI: 10.1093/database/baz019] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/25/2018] [Revised: 01/09/2019] [Accepted: 01/26/2019] [Indexed: 01/07/2023]

Abstract

Gene-phenotype associations play an important role in understanding the disease mechanisms which is a requirement for treatment development. A portion of gene-phenotype associations are observed mainly experimentally and made publicly available through several standard resources such as MGI. However, there is still a vast amount of gene-phenotype associations buried in the biomedical literature. Given the large amount of literature data, we need automated text mining tools to alleviate the burden in manual curation of gene-phenotype associations and to develop comprehensive resources. In this study, we present an ontology-based approach in combination with statistical methods to text mine gene-phenotype associations from the literature. Our method achieved AUC values of 0.90 and 0.75 in recovering known gene-phenotype associations from HPO and MGI respectively. We posit that candidate genes and their relevant diseases should be expressed with similar phenotypes in publications. Thus, we demonstrate the utility of our approach by predicting disease candidate genes based on the semantic similarities of phenotypes associated with genes and diseases. To the best of our knowledge, this is the first study using an ontology based approach to extract gene-phenotype associations from the literature. We evaluated our disease candidate prediction model on the gene-disease associations from MGI. Our model achieved AUC values of 0.90 and 0.87 on OMIM (human) and MGI (mouse) datasets of gene-disease associations respectively. Our manual analysis on the text mined data revealed that our method can accurately extract gene-phenotype associations which are not currently covered by the existing public gene-phenotype resources. Overall, results indicate that our method can precisely extract known as well as new gene-phenotype associations from literature. All the data and methods are available at https://github.com/bio-ontology-research-group/genepheno.

Acharya A, Li S, Liu X, Pelekos G, Ziebolz D, Mattheos N. Biological links in periodontitis and rheumatoid arthritis: Discovery via text‐mining PubMed abstracts. J Periodontal Res 2018;54:318-328. [DOI: 10.1111/jre.12632] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/02/2018] [Revised: 11/01/2018] [Accepted: 11/18/2018] [Indexed: 12/18/2022]