1
|
Ming S, Zhang R, Kilicoglu H. Enhancing the coverage of SemRep using a relation classification approach. J Biomed Inform 2024; 155:104658. [PMID: 38782169 DOI: 10.1016/j.jbi.2024.104658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 05/01/2024] [Accepted: 05/18/2024] [Indexed: 05/25/2024]
Abstract
OBJECTIVE Relation extraction is an essential task in the field of biomedical literature mining and offers significant benefits for various downstream applications, including database curation, drug repurposing, and literature-based discovery. The broad-coverage natural language processing (NLP) tool SemRep has established a solid baseline for extracting subject-predicate-object triples from biomedical text and has served as the backbone of the Semantic MEDLINE Database (SemMedDB), a PubMed-scale repository of semantic triples. While SemRep achieves reasonable precision (0.69), its recall is relatively low (0.42). In this study, we aimed to enhance SemRep using a relation classification approach, in order to eventually increase the size and the utility of SemMedDB. METHODS We combined and extended existing SemRep evaluation datasets to generate training data. We leveraged the pre-trained PubMedBERT model, enhancing it through additional contrastive pre-training and fine-tuning. We experimented with three entity representations: mentions, semantic types, and semantic groups. We evaluated the model performance on a portion of the SemRep Gold Standard dataset and compared it to SemRep performance. We also assessed the effect of the model on a larger set of 12K randomly selected PubMed abstracts. RESULTS Our results show that the best model yields a precision of 0.62, recall of 0.81, and F1 score of 0.70. Assessment on 12K abstracts shows that the model could double the size of SemMedDB, when applied to entire PubMed. We also manually assessed the quality of 506 triples predicted by the model that SemRep had not previously identified, and found that 67% of these triples were correct. CONCLUSION These findings underscore the promise of our model in achieving a more comprehensive coverage of relationships mentioned in biomedical literature, thereby showing its potential in enhancing various downstream applications of biomedical literature mining. 
Data and code related to this study are available at https://github.com/Michelle-Mings/SemRep_RelationClassification.
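The three entity representations named in the abstract (mentions, semantic types, semantic groups) can be illustrated with a minimal sketch. The marker format (`[SUBJ]`, `[OBJ]`), the `Entity` class, the `represent` function, and the tiny lookup table are all assumptions made for illustration; they are not taken from the paper or its repository. The only external facts used are that the UMLS semantic type `phsu` (Pharmacologic Substance) belongs to the semantic group `CHEM` and `dsyn` (Disease or Syndrome) belongs to `DISO`.

```python
from dataclasses import dataclass

# Hypothetical lookup: two UMLS semantic types mapped to their semantic groups.
SEMTYPE_TO_GROUP = {"phsu": "CHEM", "dsyn": "DISO"}


@dataclass
class Entity:
    start: int    # character offset where the mention begins
    end: int      # character offset just past the mention
    semtype: str  # UMLS semantic type abbreviation


def represent(sentence, subj, obj, mode):
    """Rewrite both arguments of a candidate relation according to `mode`.

    mode is one of "mention", "semtype", "semgroup" (illustrative names).
    """
    def label(entity, role):
        if mode == "mention":
            inner = sentence[entity.start:entity.end]
        elif mode == "semtype":
            inner = entity.semtype
        elif mode == "semgroup":
            inner = SEMTYPE_TO_GROUP[entity.semtype]
        else:
            raise ValueError(f"unknown mode: {mode}")
        return f"[{role}] {inner} [/{role}]"

    # Replace the later span first so the earlier span's offsets stay valid.
    spans = sorted([(subj, "SUBJ"), (obj, "OBJ")],
                   key=lambda pair: pair[0].start, reverse=True)
    out = sentence
    for entity, role in spans:
        out = out[:entity.start] + label(entity, role) + out[entity.end:]
    return out


sentence = "Metformin treats type 2 diabetes."
obj_start = sentence.index("type 2 diabetes")
subj = Entity(0, len("Metformin"), "phsu")
obj = Entity(obj_start, obj_start + len("type 2 diabetes"), "dsyn")

for mode in ("mention", "semtype", "semgroup"):
    print(mode, "->", represent(sentence, subj, obj, mode))
```

A classifier would then receive one of these rewritten sentences as input; which representation works best is an empirical question that the paper itself evaluates.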
Affiliation(s)
- Shufan Ming
- School of Information Sciences, University of Illinois Urbana-Champaign, 501 E Daniel St., Champaign, 61820, IL, USA
| | - Rui Zhang
- Division of Computational Health Sciences, Department of Surgery, University of Minnesota, 516 Delaware St SE, Minneapolis, 55455, MN, USA
| | - Halil Kilicoglu
- School of Information Sciences, University of Illinois Urbana-Champaign, 501 E Daniel St., Champaign, 61820, IL, USA.
|
2
|
Akanyibah FA, Zhu Y, Wan A, Ocansey DKW, Xia Y, Fang AN, Mao F. Effects of DNA methylation and its application in inflammatory bowel disease (Review). Int J Mol Med 2024; 53:55. [PMID: 38695222 DOI: 10.3892/ijmm.2024.5379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 04/15/2024] [Indexed: 05/12/2024] Open
Abstract
Inflammatory bowel disease (IBD) is marked by persistent inflammation, and its development and progression are linked to environmental, genetic, immune system and gut microbial factors. DNA methylation (DNAm) is a crucial epigenetic process used by cells to control gene transcription. DNAm has drawn increasing attention recently, with studies revealing that the interleukin (IL)‑23/IL‑12, wingless‑related integration site, IL‑6‑associated signal transducer and activator of transcription 3, suppressor of cytokine signaling 3 and apoptosis signaling pathways are involved in DNAm and in the pathogenesis of IBD. It has emerged that DNAm‑associated genes are involved in perpetuating the persistent inflammation that characterizes a number of diseases, including IBD, which suggests a novel therapeutic strategy for their treatment. The present review discusses DNAm‑associated genes in the pathogenesis of IBD and summarizes their application as possible diagnostic, prognostic and therapeutic biomarkers in IBD. This may provide a reference for particular forms of IBD and their related methylation genes, aiding clinical decision‑making and encouraging therapeutic alternatives.
Affiliation(s)
- Francis Atim Akanyibah
- Department of Laboratory Medicine, Lianyungang Clinical College, Jiangsu University, Lianyungang, Jiangsu 222006, P.R. China
| | - Yi Zhu
- The People's Hospital of Danyang, Affiliated Danyang Hospital of Nantong University, Zhenjiang, Jiangsu 212300, P.R. China
| | - Aijun Wan
- Zhenjiang College, Zhenjiang, Jiangsu 212028, P.R. China
| | - Dickson Kofi Wiredu Ocansey
- Key Laboratory of Medical Science and Laboratory Medicine of Jiangsu Province, School of Medicine, Jiangsu University, Zhenjiang, Jiangsu 212013, P.R. China
| | - Yuxuan Xia
- Key Laboratory of Medical Science and Laboratory Medicine of Jiangsu Province, School of Medicine, Jiangsu University, Zhenjiang, Jiangsu 212013, P.R. China
| | - An-Ning Fang
- Basic Medical School, Anhui Medical College, Hefei, Anhui 230061, P.R. China
| | - Fei Mao
- Department of Laboratory Medicine, Lianyungang Clinical College, Jiangsu University, Lianyungang, Jiangsu 222006, P.R. China
|
3
|
Saadh MJ, Mikhailova MV, Rasoolzadegan S, Falaki M, Akhavanfar R, Gonzáles JLA, Rigi A, Kiasari BA. Therapeutic potential of mesenchymal stem/stromal cells (MSCs)-based cell therapy for inflammatory bowel diseases (IBD) therapy. Eur J Med Res 2023; 28:47. [PMID: 36707899 PMCID: PMC9881387 DOI: 10.1186/s40001-023-01008-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 08/21/2022] [Accepted: 01/10/2023] [Indexed: 01/28/2023] Open
Abstract
Recently, mesenchymal stem/stromal cell (MSC) therapy has become an emerging therapeutic modality for the treatment of inflammatory bowel disease (IBD), given the immunoregulatory and pro-survival attributes of MSCs. MSCs alleviate dysregulated inflammatory responses through the secretion of a myriad of anti-inflammatory mediators, such as interleukin 10 (IL-10), transforming growth factor-β (TGFβ), prostaglandin E2 (PGE2), and tumor necrosis factor-stimulated gene-6 (TSG-6). Indeed, MSC treatment of IBD largely acts through local microcirculation construction, colonization and repair, and immunomodulation, thus alleviating disease severity. The clinical therapeutic efficacy relies on the marked secretion of various molecules from viable MSCs via paracrine mechanisms that are required for gut immuno-microbiota regulation and for the proliferation and differentiation of surrounding cells such as intestinal epithelial cells (IECs) and intestinal stem cells (ISCs). For example, MSCs can induce IEC proliferation and upregulate the expression of tight junction (TJ)-associated proteins, ensuring intestinal barrier integrity. Given the encouraging results derived from animal studies, various clinical trials have been conducted or are ongoing to address the safety and efficacy of MSC administration in IBD patients. Although the safety and short-term efficacy of MSC administration have been demonstrated, the long-term efficacy of MSC transplantation has not yet been verified. Herein, we highlight the therapeutic capacity of MSC therapy, including naïve MSCs, preconditioned MSCs, and MSC-derived exosomes, to alleviate IBD severity in experimental models. A brief overview of published clinical trials in IBD patients is also provided.
Affiliation(s)
- Mohamed J. Saadh
- Department of Basic Sciences, Faculty of Pharmacy, Middle East University, Amman, 11831 Jordan
| | - Maria V. Mikhailova
- I.M. Sechenov First Moscow State Medical University (Sechenov University), Moscow, Russia
| | - Soheil Rasoolzadegan
- Department of Surgery, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mojgan Falaki
- Department of Internal Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Roozbeh Akhavanfar
- School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
| | | | - Amir Rigi
- Department of Nursing, Young Researchers and Elite Club, Zahedan Branch, Azad University, Zahedan, Iran
| | - Bahman Abedi Kiasari
- Virology Department, Faculty of Veterinary Medicine, The University of Tehran, Tehran, Iran
|
4
|
Computational drug repurposing based on electronic health records: a scoping review. NPJ Digit Med 2022; 5:77. [PMID: 35701544 PMCID: PMC9198008 DOI: 10.1038/s41746-022-00617-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/12/2021] [Accepted: 05/19/2022] [Indexed: 11/30/2022] Open
Abstract
Computational drug repurposing methods adapt artificial intelligence (AI) algorithms to discover new applications of approved or investigational drugs. Among the heterogeneous datasets available, electronic health record (EHR) datasets provide rich longitudinal and pathophysiological data that facilitate the generation and validation of drug repurposing hypotheses. Here, we present an appraisal of recently published research on computational drug repurposing utilizing EHRs. Thirty-three research articles, retrieved from Embase, Medline, Scopus, and Web of Science between January 2000 and January 2022, were included in the final review. Four themes are presented: (1) publication venue, (2) data types and sources, (3) methods for data processing and prediction, and (4) targeted disease, validation, and released tools. The review summarizes the contribution of EHRs to drug repurposing and reveals that their utilization is hindered by issues with the validation, accessibility, and understanding of EHRs. These findings can support researchers in the utilization of medical data resources and the development of computational methods for drug repurposing.
|
5
|
Zielinski MR, Gibbons AJ. Neuroinflammation, Sleep, and Circadian Rhythms. Front Cell Infect Microbiol 2022; 12:853096. [PMID: 35392608 PMCID: PMC8981587 DOI: 10.3389/fcimb.2022.853096] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 01/12/2022] [Accepted: 02/24/2022] [Indexed: 12/14/2022] Open
Abstract
Molecules involved in innate immunity affect sleep and circadian oscillators and vice versa. Sleep-inducing inflammatory molecules are activated by increased waking activity and pathogens. Pathologies that alter inflammatory molecules, such as traumatic brain injury, cancer, cardiovascular disease, and stroke, are often associated with disturbed sleep and electroencephalogram power spectra. Moreover, sleep disorders, such as insomnia and sleep disordered breathing, are associated with increased dysregulation of inflammatory processes. Inflammatory molecules in both the central nervous system and periphery can alter sleep. Inflammation can also modulate cerebral vascular hemodynamics, which is associated with alterations in electroencephalogram power spectra. However, further research is needed to determine the interactions of sleep regulatory inflammatory molecules and circadian clocks. The purpose of this review is to: 1) describe the role of the inflammatory cytokines interleukin-1 beta and tumor necrosis factor-alpha and nucleotide-binding domain and leucine-rich repeat protein-3 inflammasomes in sleep regulation, 2) discuss the role of the vagus nerve in translating inflammatory signals between the periphery and central nervous system to alter sleep, and 3) present information about the relationship between cerebral vascular hemodynamics and the electroencephalogram during sleep.
Affiliation(s)
- Mark R. Zielinski
- Veterans Affairs (VA) Boston Healthcare System, West Roxbury, MA, United States
- Harvard Medical School, West Roxbury, MA, United States
- *Correspondence: Mark R. Zielinski
| | - Allison J. Gibbons
- Veterans Affairs (VA) Boston Healthcare System, West Roxbury, MA, United States
|
6
|
Mukhopadhyay S, Saha S, Chakraborty S, Prasad P, Ghosh A, Aich P. Differential colitis susceptibility of Th1- and Th2-biased mice: A multi-omics approach. PLoS One 2022; 17:e0264400. [PMID: 35263357 PMCID: PMC8906622 DOI: 10.1371/journal.pone.0264400] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/01/2021] [Accepted: 02/09/2022] [Indexed: 01/08/2023] Open
Abstract
The health and economic burden of colitis is increasing globally. Understanding the role of host genetics and metagenomics is essential to establish the molecular basis of colitis pathogenesis. In the present study, we have used a common composite dose of DSS to compare the differential disease severity response in C57BL/6 (Th1 biased) and BALB/c (Th2 biased) mice with zero mortality rates. We employed multi-omics approaches and developed a newer vector analysis approach to understand the molecular basis of the disease pathogenesis. In the current report, comparative transcriptomics, metabonomics, and metagenomics analyses revealed that the Th1 background of C57BL/6 induced intense inflammatory responses throughout the treatment period. On the contrary, the Th2 background of BALB/c resisted severe inflammatory responses by modulating the host’s inflammatory, metabolic, and gut microbial profile. The multi-omics approach also helped us discover some unique metabolic and microbial markers associated with the disease severity. These biomarkers could be used in diagnostics.
Affiliation(s)
- Sohini Mukhopadhyay
- School of Biological Sciences, National Institute of Science Education and Research (NISER), HBNI, Khurdha, Odisha, India
- Homi Bhabha National Institute, Training School Complex, Anushaktinagar, Mumbai, India
| | - Subha Saha
- Institute of Life Sciences, NALCO Square, Bhubaneswar, Odisha, India
| | - Subhayan Chakraborty
- Homi Bhabha National Institute, Training School Complex, Anushaktinagar, Mumbai, India
- School of Chemical Sciences, National Institute of Science Education and Research (NISER), HBNI, Khurdha, Odisha, India
| | - Punit Prasad
- Institute of Life Sciences, NALCO Square, Bhubaneswar, Odisha, India
| | - Arindam Ghosh
- Homi Bhabha National Institute, Training School Complex, Anushaktinagar, Mumbai, India
- School of Chemical Sciences, National Institute of Science Education and Research (NISER), HBNI, Khurdha, Odisha, India
| | - Palok Aich
- School of Biological Sciences, National Institute of Science Education and Research (NISER), HBNI, Khurdha, Odisha, India
- Homi Bhabha National Institute, Training School Complex, Anushaktinagar, Mumbai, India
|
7
|
Dong L, Zheng Q, Cheng Y, Zhou M, Wang M, Xu J, Xu Z, Wu G, Yu Y, Ye L, Feng Z. Gut Microbial Characteristics of Adult Patients With Epilepsy. Front Neurosci 2022; 16:803538. [PMID: 35250450 PMCID: PMC8888681 DOI: 10.3389/fnins.2022.803538] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/28/2021] [Accepted: 01/03/2022] [Indexed: 01/01/2023] Open
Abstract
Objective: To characterize the intestinal flora of patients with epilepsy and its correlation with epilepsy. Methods: Patients aged > 18 years were consecutively enrolled from the outpatient department, Affiliated Hospital of Guizhou Medical University, from January 2018 to December 2019. A total of 71 subjects were recruited, including epilepsy patients (n = 41) as an observation group and patient family members (n = 30) as a control group. Fresh stool specimens were collected from all subjects. 16S ribosomal RNA sequencing data were analyzed to determine changes in intestinal flora composition and its correlation with epilepsy. Subgroup analysis was then conducted. All patients with epilepsy were divided into an urban group (n = 21) and a rural group (n = 20) according to region, and bioinformatics analyses were repeated between subgroups. Results: LEfSe analysis showed that Fusobacterium, Megasphaera, Alloprevotella, and Sutterella had relatively increased abundance in the epilepsy group at the genus level. Correlation analysis suggested that Fusobacterium sp. (r = 0.584, P < 0.01), Fusobacterium mortiferum (r = 0.560, P < 0.01), Ruminococcus gnavus (r = 0.541, P < 0.01), and Bacteroides fragilis (r = 0.506, P < 0.01) were significantly positively correlated with the occurrence of epilepsy (r ≥ 0.5, P < 0.05). PICRUSt function prediction analysis showed that there were significant differences in 16 pathways between the groups at level 3. Comparing the rural group with the urban group, Proteobacteria increased at the phylum level and Escherichia coli, Fusobacterium varium, Prevotella stercorea, and Prevotellaceae bacterium DJF VR15 increased at the species level in the rural group. Conclusion: There were significant differences in the composition and functional pathways of gut flora between epilepsy patients and patient family members. Fusobacterium may become a potential biomarker for the diagnosis of epilepsy.
Affiliation(s)
- Lian Dong
- Department of Neurology, Affiliated Hospital of Guizhou Medical University, Guiyang, China
| | - Qian Zheng
- Department of Neurology, Affiliated Hospital of Guizhou Medical University, Guiyang, China
| | - Yongran Cheng
- School of Public Health, Hangzhou Medical College, Hangzhou, China
| | - Mengyun Zhou
- Department of Molecular and Cellular Physiology, Shinshu University School of Medicine, Nagano, Japan
| | - Mingwei Wang
- Department of Cardiology, Affiliated Hospital of Hangzhou Normal University, Hangzhou, China
| | - Jianwei Xu
- National Guizhou Joint Engineering Laboratory for Cell Engineering and Biomedicine Technique, Guizhou Province Key Laboratory of Regenerative Medicine, Center for Tissue Engineering and Stem Cell Research, Guizhou Medical University, Guiyang, China
| | - Zucai Xu
- Department of Neurology, Affiliated Hospital of Zunyi Medical University, Zunyi, China
| | - Guofeng Wu
- Department of Neurology, Affiliated Hospital of Guizhou Medical University, Guiyang, China
| | - Yunli Yu
- Department of Neurology, Affiliated Hospital of Guizhou Medical University, Guiyang, China
| | - Lan Ye
- The Medical Function Laboratory of Experimental Teaching Center of Basic Medicine, Guizhou Medical University, Guiyang, China
- *Correspondence: Lan Ye
| | - Zhanhui Feng
- Department of Neurology, Affiliated Hospital of Guizhou Medical University, Guiyang, China
- *Correspondence: Zhanhui Feng
|
8
|
Cheerkoot-Jalim S, Khedo KK. Literature-based discovery approaches for evidence-based healthcare: a systematic review. HEALTH AND TECHNOLOGY 2021; 11:1205-1217. [PMID: 34722102 PMCID: PMC8542914 DOI: 10.1007/s12553-021-00605-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/03/2021] [Accepted: 09/28/2021] [Indexed: 12/12/2022]
Abstract
Purpose Literature-Based Discovery (LBD) is a text mining technique used to generate novel hypotheses from vast amounts of literature sources, by identifying links between concepts from disparate sources. One of the main areas where it has been predominantly applied is the healthcare domain, whereby promising results, in the form of novel hypotheses, have been reported. The purpose of this work was to conduct a systematic literature review of recent publications on LBD in the healthcare domain in order to assess the trends in the approaches used and to identify issues and challenges for such systems. Methods The review was conducted following the principles of the Kitchenham method. The selected studies have been scrutinized and the derived findings have been reported following the PRISMA guidelines. Results The review results reveal useful information regarding the application areas, the data sources considered, the approaches used, the performance in terms of accuracy and reliability and future research challenges. The results of this review will be beneficial to LBD researchers and other stakeholders in the healthcare domain, by providing them with useful insights on the approaches to adopt, data sources to consider, evaluation model to use and challenges to reflect on. Conclusion The synthesis of the results of this work has shed light on recent issues and challenges that drive new LBD models and provides avenues for their application in other diverse areas in the healthcare domain. To the best of our knowledge, no such recent review has been conducted.
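The idea of "identifying links between concepts from disparate sources" is often illustrated with the ABC pattern from the LBD literature: if A is linked to B in one source and B is linked to C in another, an unreported A-C link becomes a hypothesis. The sketch below is a toy illustration of that pattern only; the abstract does not describe a specific algorithm, and the function name `abc_candidates`, the predicate labels, and the input format are assumptions. The example terms follow Swanson's well-known fish oil and Raynaud's syndrome case.

```python
from collections import defaultdict


def abc_candidates(triples, start):
    """Open-discovery ABC pattern over (subject, predicate, object) triples.

    Returns a dict mapping each candidate C term to the set of intermediate
    B terms that connect it to `start`, skipping C terms that `start` is
    already linked to directly.
    """
    neighbors = defaultdict(set)
    for subj, _pred, obj in triples:
        neighbors[subj].add(obj)

    direct = set(neighbors.get(start, set()))
    candidates = defaultdict(set)
    for b in direct:
        for c in neighbors.get(b, set()):
            if c != start and c not in direct:
                candidates[c].add(b)
    return dict(candidates)


# Toy triples; the predicate names are illustrative labels only.
triples = [
    ("fish oil", "REDUCES", "blood viscosity"),
    ("fish oil", "INHIBITS", "platelet aggregation"),
    ("blood viscosity", "ASSOCIATED_WITH", "Raynaud's syndrome"),
    ("platelet aggregation", "ASSOCIATED_WITH", "Raynaud's syndrome"),
]

result = abc_candidates(triples, "fish oil")
print(result)
```

Real LBD systems add filtering and ranking on top of this pattern, since the raw set of A-B-C paths grows very quickly on literature-scale data.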
Affiliation(s)
- Sudha Cheerkoot-Jalim
- Department of Information and Communication Technologies, University of Mauritius, Reduit, Mauritius
| | - Kavi Kumar Khedo
- Department of Digital Technologies, University of Mauritius, Reduit, Mauritius
|
9
|
Zhang R, Hristovski D, Schutte D, Kastrin A, Fiszman M, Kilicoglu H. Drug repurposing for COVID-19 via knowledge graph completion. J Biomed Inform 2021; 115:103696. [PMID: 33571675 PMCID: PMC7869625 DOI: 10.1016/j.jbi.2021.103696] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 10/19/2020] [Revised: 12/23/2020] [Accepted: 02/01/2021] [Indexed: 02/07/2023]
Abstract
OBJECTIVE To discover candidate drugs to repurpose for COVID-19 using literature-derived knowledge and knowledge graph completion methods. METHODS We propose a novel, integrative, and neural network-based literature-based discovery (LBD) approach to identify drug candidates from PubMed and other COVID-19-focused research literature. Our approach relies on semantic triples extracted using SemRep (via SemMedDB). We identified an informative and accurate subset of semantic triples using filtering rules and an accuracy classifier developed on a BERT variant. We used this subset to construct a knowledge graph, and applied five state-of-the-art, neural knowledge graph completion algorithms (i.e., TransE, RotatE, DistMult, ComplEx, and STELP) to predict drug repurposing candidates. The models were trained and assessed using a time slicing approach and the predicted drugs were compared with a list of drugs reported in the literature and evaluated in clinical trials. These models were complemented by a discovery pattern-based approach. RESULTS The accuracy classifier based on PubMedBERT achieved the best performance (F1 = 0.854) in identifying accurate semantic predications. Among the five knowledge graph completion models, TransE outperformed the others (MR = 0.923, Hits@1 = 0.417). Some known drugs linked to COVID-19 in the literature were identified, as well as others that have not yet been studied. Discovery patterns enabled identification of additional candidate drugs and generation of plausible hypotheses regarding the links between the candidate drugs and COVID-19. Among them, five highly ranked and novel drugs (i.e., paclitaxel, SB 203580, alpha 2-antiplasmin, metoclopramide, and oxymatrine) and the mechanistic explanations for their potential use are further discussed. CONCLUSION We showed that an LBD approach can be feasible not only for discovering drug candidates for COVID-19, but also for generating mechanistic explanations.
Our approach can be generalized to other diseases as well as to other clinical questions. Source code and data are available at https://github.com/kilicogluh/lbd-covid.
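To make the knowledge graph completion step more concrete, the following is a minimal sketch of how TransE scores a candidate triple and how a Hits@k figure is computed from ranks. TransE models a relation as a translation in embedding space, so a plausible triple (h, r, t) has h + r close to t. The two-dimensional embeddings, entity names, and helper names (`transe_score`, `rank_of`, `hits_at_k`) are invented for illustration; they are not the paper's trained model, data, or evaluation code.

```python
import math


def transe_score(h, r, t):
    """TransE plausibility: negative L2 distance between h + r and t.

    Higher (closer to zero) means the triple is considered more plausible.
    """
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_of(true_tail, head, rel, entities, relations):
    """Rank of the true tail among all candidate tails (1 = best)."""
    scores = {
        name: transe_score(entities[head], relations[rel], vec)
        for name, vec in entities.items()
        if name != head
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(true_tail) + 1


def hits_at_k(ranks, k):
    """Fraction of test triples whose true tail is ranked in the top k."""
    return sum(rank <= k for rank in ranks) / len(ranks)


# Toy embeddings (assumed values, chosen by hand).
entities = {
    "drug_a": (0.0, 0.0),
    "drug_b": (0.0, 0.9),
    "covid": (1.0, 0.0),
    "flu": (0.0, 1.0),
}
relations = {"TREATS": (1.0, 0.0)}

ranks = [
    rank_of("covid", "drug_a", "TREATS", entities, relations),
    rank_of("flu", "drug_b", "TREATS", entities, relations),
]
print(ranks, hits_at_k(ranks, 1), hits_at_k(ranks, 3))
```

In a drug repurposing setting, the same ranking is run with a disease fixed and drugs as the candidates, and highly ranked drugs without a known link become the predictions.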
Affiliation(s)
- Rui Zhang
- Institute for Health Informatics and Department of Pharmaceutical Care & Health Systems, University of Minnesota, MN, USA.
| | - Dimitar Hristovski
- Institute for Biostatistics and Medical Informatics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Dalton Schutte
- Institute for Health Informatics and Department of Pharmaceutical Care & Health Systems, University of Minnesota, MN, USA
| | - Andrej Kastrin
- Institute for Biostatistics and Medical Informatics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Marcelo Fiszman
- NITES - Núcleo de Inovação e Tecnologia Em Saúde, Pontifical Catholic University of Rio de Janeiro, Brazil
| | - Halil Kilicoglu
- School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, IL, USA
|
10
|
Vazifehkhah S, Khanizadeh AM, Mojarad TB, Nikbakht F. The possible role of progranulin on anti-inflammatory effects of metformin in temporal lobe epilepsy. J Chem Neuroanat 2020; 109:101849. [DOI: 10.1016/j.jchemneu.2020.101849] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/29/2020] [Revised: 07/11/2020] [Accepted: 07/12/2020] [Indexed: 12/21/2022]
|
11
|
Kilicoglu H, Rosemblat G, Fiszman M, Shin D. Broad-coverage biomedical relation extraction with SemRep. BMC Bioinformatics 2020; 21:188. [PMID: 32410573 PMCID: PMC7222583 DOI: 10.1186/s12859-020-3517-7] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 02/18/2020] [Accepted: 04/29/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In the era of information overload, natural language processing (NLP) techniques are increasingly needed to support advanced biomedical information management and discovery applications. In this paper, we present an in-depth description of SemRep, an NLP system that extracts semantic relations from PubMed abstracts using linguistic principles and UMLS domain knowledge. We also evaluate SemRep on two datasets. In one evaluation, we use a manually annotated test collection and perform a comprehensive error analysis. In another evaluation, we assess SemRep's performance on the CDR dataset, a standard benchmark corpus annotated with causal chemical-disease relationships. RESULTS A strict evaluation of SemRep on our manually annotated dataset yields 0.55 precision, 0.34 recall, and 0.42 F1 score. A relaxed evaluation, which more accurately characterizes SemRep performance, yields 0.69 precision, 0.42 recall, and 0.52 F1 score. An error analysis reveals named entity recognition/normalization as the largest source of errors (26.9%), followed by argument identification (14%) and trigger detection errors (12.5%). The evaluation on the CDR corpus yields 0.90 precision, 0.24 recall, and 0.38 F1 score. The recall and the F1 score increase to 0.35 and 0.50, respectively, when the evaluation on this corpus is limited to sentence-bound relationships, which represents a fairer evaluation, as SemRep operates at the sentence level. CONCLUSIONS SemRep is a broad-coverage, interpretable, strong baseline system for extracting semantic relations from biomedical text. It also underpins SemMedDB, a literature-scale knowledge graph based on semantic relations. Through SemMedDB, SemRep has had significant impact in the scientific community, supporting a variety of clinical and translational applications, including clinical decision making, medical diagnosis, drug repurposing, literature-based discovery and hypothesis generation, and contributing to improved health outcomes.
In ongoing development, we are redesigning SemRep to increase its modularity and flexibility, and addressing weaknesses identified in the error analysis.
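The F1 scores reported in this abstract can be reproduced from the stated precision and recall, since F1 is their harmonic mean, 2PR / (P + R). The short check below uses only numbers from the abstract, with one assumption: for the sentence-bound CDR evaluation the abstract gives only the new recall and F1, so precision is assumed to stay at 0.90. The dictionary keys are descriptive labels chosen for this sketch.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per evaluation setting.
reported = {
    "strict, annotated set": (0.55, 0.34, 0.42),
    "relaxed, annotated set": (0.69, 0.42, 0.52),
    "CDR corpus": (0.90, 0.24, 0.38),
    # Precision of 0.90 is an assumption here; the abstract does not restate it.
    "CDR corpus, sentence-bound": (0.90, 0.35, 0.50),
}

computed = {name: round(f1(p, r), 2) for name, (p, r, _f) in reported.items()}

for name, (p, r, f_reported) in reported.items():
    print(f"{name}: P={p:.2f} R={r:.2f} F1={computed[name]:.2f} (reported {f_reported:.2f})")
```

Because the harmonic mean is pulled toward the smaller of the two values, the low recall dominates each F1 figure even where precision is high, as in the CDR evaluation.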
Affiliation(s)
- Halil Kilicoglu
- Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike, Bethesda, 20894 MD USA
- University of Illinois at Urbana-Champaign, School of Information Sciences, 501 E Daniel Street, Champaign, 61820 IL USA
| | - Graciela Rosemblat
- Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike, Bethesda, 20894 MD USA
| | | | - Dongwook Shin
- Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike, Bethesda, 20894 MD USA
|