1
Li D, Wu P, Dong Y, Gu J, Qian L, Zhou G. Joint learning-based causal relation extraction from biomedical literature. J Biomed Inform 2023; 139:104318. [PMID: 36781035 DOI: 10.1016/j.jbi.2023.104318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 02/03/2023] [Accepted: 02/08/2023] [Indexed: 02/13/2023]
Abstract
Causal relation extraction of biomedical entities, one of the most complex tasks in biomedical text mining, involves two kinds of information: entity relations and entity functions. One feasible approach is to treat relation extraction and function detection as two independent sub-tasks. However, this separate learning method ignores the intrinsic correlation between them and leads to unsatisfactory performance. In this paper, we propose a joint learning model, which combines entity relation extraction and entity function detection to exploit their commonality and capture their inter-relationship, so as to improve the performance of biomedical causal relation extraction. Experimental results on the BioCreative-V Track 4 corpus show that our joint learning model outperforms the separate models in BEL statement extraction, achieving F1 scores of 57.0% and 37.3% on the test set in Stage 2 and Stage 1 evaluations, respectively. This demonstrates that our joint learning system reaches state-of-the-art performance in Stage 2 compared with other systems.
Affiliation(s)
- Dongling Li
  - School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu Province 215006, China.
- Pengchao Wu
  - School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu Province 215006, China.
- Yuehu Dong
  - School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu Province 215006, China.
- Jinghang Gu
  - Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong 999077, China.
- Longhua Qian
  - School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu Province 215006, China.
- Guodong Zhou
  - School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu Province 215006, China.
2
Madan S, Szostak J, Komandur Elayavilli R, Tsai RTH, Ali M, Qian L, Rastegar-Mojarad M, Hoeng J, Fluck J. The extraction of complex relationships and their conversion to biological expression language (BEL) overview of the BioCreative VI (2017) BEL track. Database: The Journal of Biological Databases and Curation 2020; 2019:5585579. [PMID: 31603193 PMCID: PMC6787548 DOI: 10.1093/database/baz084] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/27/2018] [Revised: 05/22/2019] [Accepted: 05/31/2019] [Indexed: 01/12/2023]
Abstract
Knowledge of the molecular interactions of biological and chemical entities and their involvement in biological processes or clinical phenotypes is important for data interpretation. Unfortunately, this knowledge is mostly embedded in the literature in such a way that it is unavailable for automated data analysis procedures. Biological expression language (BEL) is a syntax that allows for the structured representation of a broad range of biological relationships. It is used in various situations to extract such knowledge and transform it into BEL networks. To support the tedious and time-intensive extraction work of curators with automated methods, we developed the BEL track within the framework of BioCreative Challenges. Within the BEL track, we provide training data and an evaluation environment to encourage the text mining community to tackle the automatic extraction of complex BEL relationships. At BioCreative VI in 2017, the 2015 BEL track was repeated with new test data. Although only minor improvements in text snippet retrieval for given statements were achieved during this second BEL task iteration, a significant increase in BEL statement extraction performance from provided sentences could be seen. The best performing system reached a 32% F-score for the extraction of complete BEL statements, and with the given named entities this increased to 49%. This time, besides rule-based systems, new methods involving hierarchical sequence labeling and neural networks were applied for BEL statement extraction.
Affiliation(s)
- Sumit Madan
  - Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 Sankt Augustin, Germany
- Justyna Szostak
  - Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchatel, Switzerland
- Richard Tzong-Han Tsai
  - Department of Computer Science and Information Engineering, National Central University, Taiwan, R.O.C., Taiwan 320
- Mehdi Ali
  - Friedrich Wilhelm University of Bonn, 53012 Bonn, Germany
- Longhua Qian
  - NLP Lab, School of Computer Science and Technology, Soochow University, Suzhou, 215006 Suzhou, China
- Majid Rastegar-Mojarad
  - Department of Health Sciences Research, Mayo Clinic, 200 First St. SW, Rochester, MN 55905, USA
- Julia Hoeng
  - Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchatel, Switzerland
- Juliane Fluck
  - Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 Sankt Augustin, Germany
3
Lai PT, Lo YY, Huang MS, Hsiao YC, Tsai RTH. BelSmile: a biomedical semantic role labeling approach for extracting biological expression language from text. Database (Oxford) 2016; 2016:baw064. [PMID: 27173520 PMCID: PMC4865328 DOI: 10.1093/database/baw064] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 12/05/2015] [Revised: 04/08/2016] [Accepted: 04/11/2016] [Indexed: 02/04/2023]
Abstract
Biological expression language (BEL) is one of the most popular languages to represent the causal and correlative relationships among biological events. Automatically extracting and representing biomedical events using BEL can help biologists quickly survey and understand relevant literature. Recently, many researchers have shown interest in biomedical event extraction. However, the task is still a challenge for current systems because of the complexity of integrating different information extraction tasks such as named entity recognition (NER), named entity normalization (NEN) and relation extraction into a single system. In this study, we introduce our BelSmile system, which uses a semantic-role-labeling (SRL)-based approach to extract the NEs and events for BEL statements. BelSmile combines our previous NER, NEN and SRL systems. We evaluate BelSmile using the BioCreative V BEL task dataset. Our system achieved an F-score of 27.8%, ∼7% higher than the top BioCreative V system. The three main contributions of this study are (i) an effective pipeline approach to extract BEL statements, (ii) a syntactic-based labeler to extract subject-verb-object tuples, and (iii) a web-based version of BelSmile that is publicly available at iisrserv.csie.ncu.edu.tw/belsmile.
Affiliation(s)
- Po-Ting Lai
  - Department of Computer Science, National Tsing-Hua University, No. 101, Section 2, Kuang-Fu Road, Hsinchu, Taiwan 30013, Republic of China
- Yu-Yan Lo
  - Department of Computer Science and Information Engineering, National Central University, No. 300, Zhongda Road, Zhongli, Taoyuan, Taiwan 320, Republic of China
- Ming-Siang Huang
  - Department of Clinical Laboratory Sciences and Medical Biotechnology, College of Medicine, National Taiwan University, No.1, Section 1, Renai Road, Taipei, Taiwan 10002, Republic of China
- Yu-Cheng Hsiao
  - Department of Computer Science and Information Engineering, National Central University, No. 300, Zhongda Road, Zhongli, Taoyuan, Taiwan 320, Republic of China
- Richard Tzong-Han Tsai
  - Department of Computer Science and Information Engineering, National Central University, No. 300, Zhongda Road, Zhongli, Taoyuan, Taiwan 320, Republic of China