1
|
Cai L, Li J, Lv H, Liu W, Niu H, Wang Z. Integrating domain knowledge for biomedical text analysis into deep learning: A survey. J Biomed Inform 2023; 143:104418. [PMID: 37290540 DOI: 10.1016/j.jbi.2023.104418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2022] [Revised: 04/24/2023] [Accepted: 05/31/2023] [Indexed: 06/10/2023]
Abstract
The past decade has witnessed an explosion of textual information in the biomedical field. Biomedical texts provide a basis for healthcare delivery, knowledge discovery, and decision-making. Over the same period, deep learning has achieved remarkable performance in biomedical natural language processing, however, its development has been limited by well-annotated datasets and interpretability. To solve this, researchers have considered combining domain knowledge (such as biomedical knowledge graph) with biomedical data, which has become a promising means of introducing more information into biomedical datasets and following evidence-based medicine. This paper comprehensively reviews more than 150 recent literature studies on incorporating domain knowledge into deep learning models to facilitate typical biomedical text analysis tasks, including information extraction, text classification, and text generation. We eventually discuss various challenges and future directions.
Collapse
Affiliation(s)
- Linkun Cai
- School of Biological Science and Medical Engineering, Beihang University, 100191 Beijing, China
| | - Jia Li
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, 100050 Beijing, China
| | - Han Lv
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, 100050 Beijing, China
| | - Wenjuan Liu
- Aerospace Center Hospital, 100049 Beijing, China
| | - Haijun Niu
- School of Biological Science and Medical Engineering, Beihang University, 100191 Beijing, China
| | - Zhenchang Wang
- School of Biological Science and Medical Engineering, Beihang University, 100191 Beijing, China; Department of Radiology, Beijing Friendship Hospital, Capital Medical University, 100050 Beijing, China.
| |
Collapse
|
2
|
Murali L, Gopakumar G, Viswanathan DM, Nedungadi P. Towards electronic health record-based medical knowledge graph construction, completion, and applications: A literature study. J Biomed Inform 2023:104403. [PMID: 37230406 DOI: 10.1016/j.jbi.2023.104403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2023] [Revised: 05/16/2023] [Accepted: 05/19/2023] [Indexed: 05/27/2023]
Abstract
With the growth of data and intelligent technologies, the healthcare sector opened numerous technology that enabled services for patients, clinicians, and researchers. One major hurdle in achieving state-of-the-art results in health informatics is domain-specific terminologies and their semantic complexities. A knowledge graph crafted from medical concepts, events, and relationships acts as a medical semantic network to extract new links and hidden patterns from health data sources. Current medical knowledge graph construction studies are limited to generic techniques and opportunities and focus less on exploiting real-world data sources in knowledge graph construction. A knowledge graph constructed from Electronic Health Records (EHR) data obtains real-world data from healthcare records. It ensures better results in subsequent tasks like knowledge extraction and inference, knowledge graph completion, and medical knowledge graph applications such as diagnosis predictions, clinical recommendations, and clinical decision support. This review critically analyses existing works on medical knowledge graphs that used EHR data as the data source at (i) representation level, (ii) extraction level (iii) completion level. In this investigation, we found that EHR-based knowledge graph construction involves challenges such as high complexity and dimensionality of data, lack of knowledge fusion, and dynamic update of the knowledge graph. In addition, the study presents possible ways to tackle the challenges identified. Our findings conclude that future research should focus on knowledge graph integration and knowledge graph completion challenges.
Collapse
Affiliation(s)
- Lino Murali
- Center for Research in Analytics and Technologies for Education (CREATE), Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India; Division of Information technology, School of Engineering, Cochin University of Science and Technology, Kochi, 682022, Kerala, India
| | - G Gopakumar
- Department of Computer Science and Engineering, School of Computing, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India
| | - Daleesha M Viswanathan
- Division of Information technology, School of Engineering, Cochin University of Science and Technology, Kochi, 682022, Kerala, India
| | - Prema Nedungadi
- Center for Research in Analytics and Technologies for Education (CREATE), Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India; Department of Computer Science and Engineering, School of Computing, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India.
| |
Collapse
|
3
|
Tian Y, Jin Y, Li Z, Liu J, Liu C. Weighted Heterogeneous Graph-Based Incremental Automatic Disease Diagnosis Method. JOURNAL OF SHANGHAI JIAOTONG UNIVERSITY (SCIENCE) 2022; 29:1-11. [PMID: 36540092 PMCID: PMC9755782 DOI: 10.1007/s12204-022-2537-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Accepted: 08/30/2022] [Indexed: 12/23/2022]
Abstract
The objective of this study is to construct a multi-department symptom-based automatic diagnosis model. However, it is difficult to establish a model to classify plenty of diseases and collect thousands of disease-symptom datasets simultaneously. Inspired by the thought of "knowledge graph is model", this study proposes to build an experience-infused knowledge model by continuously learning the experiential knowledge from data, and incrementally injecting it into the knowledge graph. Therefore, incremental learning and injection are used to solve the data collection problem, and the knowledge graph is modeled and containerized to solve the large-scale multi-classification problems. First, an entity linking method is designed and a heterogeneous knowledge graph is constructed by graph fusion. Then, an adaptive neural network model is constructed for each dataset, and the data is used for statistical initialization and model training. Finally, the weights and biases of the learned neural network model are updated to the knowledge graph. It is worth noting that for the incremental process, we consider both the data and class increments. We evaluate the diagnostic effectiveness of the model on the current dataset and the anti-forgetting ability on the historical dataset after class increment on three public datasets. Compared with the classical model, the proposed model improves the diagnostic accuracy of the three datasets by 5%, 2%, and 15% on average, respectively. Meanwhile, the model under incremental learning has a better ability to resist forgetting.
Collapse
Affiliation(s)
- Yuanyuan Tian
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, 200240 China
| | - Yanrui Jin
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, 200240 China
| | - Zhiyuan Li
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, 200240 China
| | - Jinlei Liu
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, 200240 China
| | - Chengliang Liu
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, 200240 China
| |
Collapse
|
4
|
Chen J, Guo C, Lu M, Ding S. Unifying Diagnosis Identification and Prediction Method Embedding the Disease Ontology Structure From Electronic Medical Records. Front Public Health 2022; 9:793801. [PMID: 35127624 PMCID: PMC8811031 DOI: 10.3389/fpubh.2021.793801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2021] [Accepted: 12/21/2021] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE The reasonable classification of a large number of distinct diagnosis codes can clarify patient diagnostic information and help clinicians to improve their ability to assign and target treatment for primary diseases. Our objective is to identify and predict a unifying diagnosis (UD) from electronic medical records (EMRs). METHODS We screened 4,418 sepsis patients from a public MIMIC-III database and extracted their diagnostic information for UD identification, their demographic information, laboratory examination information, chief complaint, and history of present illness information for UD prediction. We proposed a data-driven UD identification and prediction method (UDIPM) embedding the disease ontology structure. First, we designed a set similarity measure method embedding the disease ontology structure to generate a patient similarity matrix. Second, we applied affinity propagation clustering to divide patients into different clusters, and extracted a typical diagnosis code co-occurrence pattern from each cluster. Furthermore, we identified a UD by fusing visual analysis and a conditional co-occurrence matrix. Finally, we trained five classifiers in combination with feature fusion and feature selection method to unify the diagnosis prediction. RESULTS The experimental results on a public electronic medical record dataset showed that the UDIPM could extracted a typical diagnosis code co-occurrence pattern effectively, identified and predicted a UD based on patients' diagnostic and admission information, and outperformed other fusion methods overall. CONCLUSIONS The accurate identification and prediction of the UD from a large number of distinct diagnosis codes and multi-source heterogeneous patient admission information in EMRs can provide a data-driven approach to assist better coding integration of diagnosis.
Collapse
Affiliation(s)
- Jingfeng Chen
- Health Management Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
| | - Chonghui Guo
- School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
| | - Menglin Lu
- School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
| | - Suying Ding
- Health Management Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| |
Collapse
|
5
|
AIM in Electronic Health Records (EHRs). Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_47] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
6
|
Concatenation of Pre-Trained Convolutional Neural Networks for Enhanced COVID-19 Screening Using Transfer Learning Technique. ELECTRONICS 2021. [DOI: 10.3390/electronics11010103] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Coronavirus (COVID-19) is the most prevalent coronavirus infection with respiratory symptoms such as fever, cough, dyspnea, pneumonia, and weariness being typical in the early stages. On the other hand, COVID-19 has a direct impact on the circulatory and respiratory systems as it causes a failure to some human organs or severe respiratory distress in extreme circumstances. Early diagnosis of COVID-19 is extremely important for the medical community to limit its spread. For a large number of suspected cases, manual diagnostic methods based on the analysis of chest images are insufficient. Faced with this situation, artificial intelligence (AI) techniques have shown great potential in automatic diagnostic tasks. This paper aims at proposing a fast and precise medical diagnosis support system (MDSS) that can distinguish COVID-19 precisely in chest-X-ray images. This MDSS uses a concatenation technique that aims to combine pre-trained convolutional neural networks (CNN) depend on the transfer learning (TL) technique to build a highly accurate model. The models enable storage and application of knowledge learned from a pre-trained CNN to a new task, viz., COVID-19 case detection. For this purpose, we employed the concatenation method to aggregate the performances of numerous pre-trained models to confirm the reliability of the proposed method for identifying the patients with COVID-19 disease from X-ray images. The proposed system was trialed on a dataset that included four classes: normal, viral-pneumonia, tuberculosis, and COVID-19 cases. Various general evaluation methods were used to evaluate the effectiveness of the proposed model. The first proposed model achieved an accuracy rate of 99.80% while the second model reached an accuracy of 99.71%.
Collapse
|
7
|
Yasnitsky LN, Dumler AA, Cherepanov FM, Yasnitsky VL, Uteva NA. Capabilities of neural network technologies for extracting new medical knowledge and enhancing precise decision making for patients. EXPERT REVIEW OF PRECISION MEDICINE AND DRUG DEVELOPMENT 2021. [DOI: 10.1080/23808993.2021.1993595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Affiliation(s)
- Leonid N. Yasnitsky
- Department of Applied Mathematics and Informatics, Perm State University, Perm, Russia
- Department of Information Technology in Business, Higher School of Economics, Perm, Russia
| | - Andrey A. Dumler
- Department of Propedeutics of Internal Diseases No. 1, Perm State Medical University, Perm, Russia
| | - Fedor M. Cherepanov
- Department of Applied Informatics, Perm State University for Humanities and Pedagogy, Perm, Russia
| | - Vitaly L. Yasnitsky
- Department of Computational Mathematics, Mechanics and Biomechanics, Perm National Research Polytechnic University, Perm, Russia
| | - Natalia A. Uteva
- Department of Propedeutics of Internal Diseases No. 1, Perm State Medical University, Perm, Russia
| |
Collapse
|
8
|
AIM in Electronic Health Records (EHRs). Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_47-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
9
|
Guan Y, Jiang J. AIM in Electronic Health Records (EHRs). Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_47-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|