1
Datta S, Roberts K. A Hybrid Deep Learning Approach for Spatial Trigger Extraction from Radiology Reports. Proceedings of the Conference on Empirical Methods in Natural Language Processing 2020; 2020:50-55. [PMID: 33336212 PMCID: PMC7744270 DOI: 10.18653/v1/2020.splu-1.6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Radiology reports contain important clinical information about patients, which is often tied together through spatial expressions. Spatial expressions (or triggers) are mainly used to describe the positioning of radiographic findings or medical devices with respect to anatomical structures. Because the expressions result from the mental visualization of the radiologist's interpretations, they are varied and complex. The focus of this work is to automatically identify spatial expression terms in reports from three different radiology sub-domains. We propose a hybrid deep learning-based NLP method with three steps: 1) generating a set of candidate spatial triggers by exact match with the known trigger terms from the training data, 2) applying domain-specific constraints to filter the candidate triggers, and 3) utilizing a BERT-based classifier to predict whether a candidate trigger is a true spatial trigger or not. The results are promising, with an improvement of 24 points in average F1 measure over a standard BERT-based sequence labeler.
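The three steps named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the `drop_nested` constraint, and the `classifier` callable are all hypothetical. The abstract does not say which domain-specific constraints were used, and the classifier here is a plain stand-in for the BERT-based model.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, matched text)


def candidate_triggers(sentence: str, known_triggers: List[str]) -> List[Span]:
    """Step 1: exact-match known trigger terms (whole words, case-insensitive)."""
    spans = set()
    for term in known_triggers:
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, sentence, flags=re.IGNORECASE):
            spans.add((m.start(), m.end(), m.group(0)))
    return sorted(spans)


def drop_nested(sentence: str, spans: List[Span]) -> List[Span]:
    """Hypothetical constraint: drop a candidate contained in a longer candidate."""
    return [
        s for s in spans
        if not any(o != s and o[0] <= s[0] and s[1] <= o[1] for o in spans)
    ]


def extract_triggers(
    sentence: str,
    known_triggers: List[str],
    constraints: List[Callable[[str, List[Span]], List[Span]]],
    classifier: Callable[[str, Span], bool],
) -> List[Span]:
    """Run steps 1 to 3: generate candidates, filter them, then classify each one."""
    spans = candidate_triggers(sentence, known_triggers)
    for constraint in constraints:  # Step 2: apply each filter in turn
        spans = constraint(sentence, spans)
    # Step 3: the classifier (a BERT-based model in the paper) makes the final call
    return [s for s in spans if classifier(sentence, s)]


if __name__ == "__main__":
    sentence = "Catheter tip in front of the atrium."
    accept_all = lambda sent, span: True  # placeholder for a trained classifier
    found = extract_triggers(
        sentence, ["in", "in front of", "of"], [drop_nested], accept_all
    )
    print([text for _, _, text in found])
```

Framing the task as classification of pre-generated candidates, rather than labeling every token, means the model only has to decide between a small number of plausible spans per sentence.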
Affiliation(s)
- Surabhi Datta
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Kirk Roberts
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
2
Wen G, Chen H, Li H, Hu Y, Li Y, Wang C. Cross domains adversarial learning for Chinese named entity recognition for online medical consultation. J Biomed Inform 2020; 112:103608. [PMID: 33132138 DOI: 10.1016/j.jbi.2020.103608] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/08/2020] [Revised: 10/19/2020] [Accepted: 10/22/2020] [Indexed: 11/19/2022]
Abstract
Deep learning methods have been applied to Chinese named entity recognition for online medical consultation. They require a large number of labelled samples. However, no such database is available at present. This paper begins by constructing a larger labelled database of Chinese texts for online medical consultation. Second, a basic framework unit is proposed, which is pre-trained via transfer learning from both a bidirectional language model and a masked language model trained on larger unlabelled data. Finally, cross domains adversarial learning (CDAL) for Chinese named entity recognition is proposed to further improve performance; it uses not only the pre-trained basic framework unit but also adversarial multi-task learning on both electronic medical record texts and online medical consultation texts. Experimental results validate the effectiveness of CDAL.
Affiliation(s)
- Guihua Wen
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China
- Hehong Chen
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China
- Huihui Li
- School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Yang Hu
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China
- Yanghui Li
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China
- Changjun Wang
- Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
3
An Y, Wang J, Zhang L, Zhao H, Gao Z, Huang H, Du Z, Jiao Z, Yan J, Wei X, Jin B. PASCAL: a pseudo cascade learning framework for breast cancer treatment entity normalization in Chinese clinical text. BMC Med Inform Decis Mak 2020; 20:204. [PMID: 32859189 PMCID: PMC7456389 DOI: 10.1186/s12911-020-01216-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/20/2020] [Accepted: 08/12/2020] [Indexed: 12/04/2022]
Abstract
Background: Knowledge discovery from breast cancer treatment records has promoted downstream clinical studies such as careflow mining and therapy analysis. However, the clinical treatment text in electronic health data may be recorded by different doctors under their own hospital's guidelines, making the final data rich in author- and domain-specific idiosyncrasies. Breast cancer treatment entity normalization is therefore an essential task for these downstream clinical studies. Recent studies have demonstrated the superiority of deep learning methods in named entity normalization tasks. However, most existing approaches adopt pipeline implementations that treat normalization as an independent step after named entity recognition, which can propagate errors to later tasks. In addition, despite its importance in clinical and translational research, few studies directly address the normalization task in Chinese clinical text, owing to the complexity of composition forms.
Methods: To address these issues, we propose PASCAL, an end-to-end and accurate framework for breast cancer treatment entity normalization (TEN). PASCAL leverages a gated convolutional neural network to obtain a representation vector that can capture contextual features and long-term dependencies. Additionally, it treats treatment entity recognition (TER) as an auxiliary task that provides meaningful information to the primary TEN task and acts as a form of regularization to further optimize the shared parameters. Finally, by concatenating the context-aware vector and probabilistic distribution vector from TEN, we use a conditional random field (CRF) layer to model the normalization sequence and predict the TEN sequential results.
Results: To evaluate the effectiveness of the proposed framework, we employ the three latest sequential models as baselines and build the model in single-task and multitask settings on a real-world database. Experimental results show that our method achieves better accuracy and efficiency than state-of-the-art approaches.
Conclusions: The effectiveness and efficiency of the presented pseudo cascade learning framework were validated for breast cancer treatment normalization in clinical text. We believe the superior performance lies in its ability to extract valuable information from unstructured text data, which will significantly contribute to downstream tasks such as treatment recommendations, breast cancer staging and careflow mining.
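For readers unfamiliar with gated convolutions, the toy sketch below shows one common form, the gated linear unit, in which one convolution produces content and a second produces a sigmoid gate that scales it. The abstract does not specify PASCAL's exact gating, so this is an assumption for illustration only. The function names are hypothetical, and the single-channel, pure-Python setup is far simpler than a real network layer.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def conv1d(seq: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Zero-padded 1D convolution (cross-correlation) that preserves length."""
    if len(kernel) % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel size")
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel))) + bias
        for i in range(len(seq))
    ]


def gated_conv1d(
    seq: List[float],
    content_kernel: List[float],
    gate_kernel: List[float],
    content_bias: float = 0.0,
    gate_bias: float = 0.0,
) -> List[float]:
    """Gated linear unit: conv(seq) scaled elementwise by sigmoid(conv(seq))."""
    content = conv1d(seq, content_kernel, content_bias)
    gate = [sigmoid(g) for g in conv1d(seq, gate_kernel, gate_bias)]
    return [c * g for c, g in zip(content, gate)]


if __name__ == "__main__":
    # Identity content kernel and an all-zero gate kernel: every gate is 0.5.
    print(gated_conv1d([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]))
```

The gate lets the layer decide, position by position, how much of the convolved signal to pass on.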
Affiliation(s)
- Yang An
- School of Computer Science and Technology, Dalian University of Technology, No.2 Linggong Road, Ganjingzi District, Dalian, Liaoning, 116024, China
- Jianlin Wang
- First Hospital of Lanzhou University, 1 Donggang W Rd, Chengguan District, Lanzhou, Gansu, 730000, China
- Liang Zhang
- International Business College, Dongbei University of Finance and Economics, No.20 Jianshan Street, Shahekou District, Dalian, Liaoning, 116025, China
- Hanyu Zhao
- Dalian University, No.10 Xuefu Street, Economic and Technological Development Zone, Dalian, Liaoning, 116622, China
- Zhan Gao
- BeiJing Haoyisheng Cloud Hospital Management Technology Ltd., No.10 Dewai Street, Xicheng District, Beijing, 100088, China
- Haitao Huang
- The People's Hospital of Liaoning Province, No.33 Shenhe District, Shenyang, Liaoning, 110016, China
- Zhenguang Du
- The People's Hospital of Liaoning Province, No.33 Shenhe District, Shenyang, Liaoning, 110016, China
- Zengtao Jiao
- AI Lab, Yidu Cloud, No.35 of Huayuan North Road, Haidian District, Beijing, 100191, China
- Jun Yan
- AI Lab, Yidu Cloud, No.35 of Huayuan North Road, Haidian District, Beijing, 100191, China
- Xiaopeng Wei
- School of Computer Science and Technology, Dalian University of Technology, No.2 Linggong Road, Ganjingzi District, Dalian, Liaoning, 116024, China
- Bo Jin
- School of Innovation and Entrepreneurship, Dalian University of Technology, No.2 Linggong Road, Ganjingzi District, Dalian, Liaoning, 116024, China