1
Shyr C, Hu Y, Bastarache L, Cheng A, Hamid R, Harris P, Xu H. Identifying and Extracting Rare Diseases and Their Phenotypes with Large Language Models. Journal of Healthcare Informatics Research 2024; 8:438-461. [PMID: 38681753 PMCID: PMC11052982 DOI: 10.1007/s41666-023-00155-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 10/24/2023] [Accepted: 11/13/2023] [Indexed: 05/01/2024]
Abstract
Purpose
Phenotyping is critical for informing rare disease diagnosis and treatment, but disease phenotypes are often embedded in unstructured text. While natural language processing (NLP) can automate extraction, a major bottleneck is developing annotated corpora. Recently, prompt learning with large language models (LLMs) has been shown to lead to generalizable results without any (zero-shot) or few annotated samples (few-shot), but none have explored this for rare diseases. Our work is the first to study prompt learning for identifying and extracting rare disease phenotypes in the zero- and few-shot settings.
Methods
We compared the performance of prompt learning with ChatGPT and fine-tuning with BioClinicalBERT. We engineered novel prompts for ChatGPT to identify and extract rare diseases and their phenotypes (e.g., diseases, symptoms, and signs), established a benchmark for evaluating its performance, and conducted an in-depth error analysis.
Results
Overall, fine-tuning BioClinicalBERT resulted in higher performance (F1 of 0.689) than ChatGPT (F1 of 0.472 and 0.610 in the zero- and few-shot settings, respectively). However, ChatGPT achieved higher accuracy for rare diseases and signs in the one-shot setting (F1 of 0.778 and 0.725). Conversational, sentence-based prompts generally achieved higher accuracy than structured lists.
Conclusion
Prompt learning using ChatGPT has the potential to match or outperform fine-tuning BioClinicalBERT at extracting rare diseases and signs with just one annotated sample. Given its accessibility, ChatGPT could be leveraged to extract these entities without relying on a large, annotated corpus. While LLMs can support rare disease phenotyping, researchers should critically evaluate model outputs to ensure phenotyping accuracy.
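The F1 scores quoted in this abstract compare extracted entities against gold annotations. As a rough illustration of how such a score can be computed, here is a minimal sketch assuming exact-match scoring over (entity type, text) pairs; the abstract does not state the matching criterion the authors used, and the function name and sample entities are hypothetical.

```python
def entity_f1(gold, predicted):
    """Exact-match precision, recall and F1 over (entity_type, text) pairs.

    Assumption: an extracted entity counts as correct only if both its type
    and its text match a gold annotation exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical gold annotations and model output for one document.
gold = [("rare_disease", "cystic fibrosis"), ("sign", "clubbing"), ("symptom", "cough")]
pred = [("rare_disease", "cystic fibrosis"), ("sign", "clubbing"), ("symptom", "fatigue")]

precision, recall, f1 = entity_f1(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Relaxed (overlap-based) matching would give higher scores than this exact-match variant, so the choice of criterion matters when comparing systems.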
Affiliation(s)
- Cathy Shyr
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Yan Hu
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77225 USA
- Lisa Bastarache
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Alex Cheng
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Rizwan Hamid
  - Division of Medical Genetics and Genomic Medicine, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Paul Harris
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
  - Department of Biomedical Engineering, Vanderbilt University Medical Center, 2525 West End Avenue, Nashville, TN 37203 USA
- Hua Xu
  - Section of Biomedical Informatics and Data Science, Yale School of Medicine, 100 College Street, New Haven, CT 06510 USA
2
Wang W, Zhao Z, Ning H. A tree-based corpus annotated with Cyber-Syndrome, symptoms, and acupoints. Sci Data 2024; 11:482. [PMID: 38730023 PMCID: PMC11087536 DOI: 10.1038/s41597-024-03321-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 04/29/2024] [Indexed: 05/12/2024] Open
Abstract
Prolonged and excessive interaction with cyberspace poses a threat to people's health and leads to Cyber-Syndrome, which covers not only physiological but also psychological disorders. This paper aims to create a tree-shaped gold-standard corpus, designated CS-A, that annotates Cyber-Syndrome, its clinical manifestations, and the acupoints that can alleviate its symptoms or signs. In the CS-A corpus, this paper defines six entities and relations subject to annotation. In total, 448 texts were annotated manually. After three rounds of updating the annotation guidelines, the inter-annotator agreement (IAA) improved significantly, reaching 86.05%. The purpose of constructing the CS-A corpus is to raise awareness of Cyber-Syndrome and draw attention to its subtle impact on people's health. Annotated corpora also promote the development of natural language processing technology. Model experiments can be built on this corpus, such as optimizing and improving models for discontinuous entity recognition and nested entity recognition. The CS-A corpus has been uploaded to figshare.
Affiliation(s)
- Wenxi Wang
  - School of Computer & Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Zhan Zhao
  - School of Computer & Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Huansheng Ning
  - School of Computer & Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
3
Zhang J, Xu W, Lei C, Pu Y, Zhang Y, Zhang J, Yu H, Su X, Huang Y, Gong R, Zhang L, Shi Q. Using Clinician-Patient WeChat Group Communication Data to Identify Symptom Burdens in Patients With Uterine Fibroids Under Focused Ultrasound Ablation Surgery Treatment: Qualitative Study. JMIR Form Res 2023; 7:e43995. [PMID: 37656501 PMCID: PMC10504630 DOI: 10.2196/43995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Revised: 12/26/2022] [Accepted: 07/24/2023] [Indexed: 09/02/2023] Open
Abstract
BACKGROUND
Unlike research project-based health data collection (questionnaires and interviews), social media platforms allow patients to freely discuss their health status and obtain peer support. Previous literature has pointed out that both public and private social platforms can serve as data sources for analysis.
OBJECTIVE
This study aimed to use natural language processing (NLP) techniques to identify concerns regarding the postoperative quality of life and symptom burdens in patients with uterine fibroids after focused ultrasound ablation surgery.
METHODS
Screenshots taken from clinician-patient WeChat groups were converted into free texts using image text recognition technology and used as the research object of this study. From 408 patients diagnosed with uterine fibroids in Chongqing Haifu Hospital between 2010 and 2020, we searched for symptom burdens in over 900,000 words of WeChat group chats. We first built a corpus of symptoms by manually coding 30% of the WeChat texts and then used regular expressions in Python to crawl symptom information from the remaining texts based on this corpus. We compared the results with a manual review (gold standard) of the same records. Finally, we analyzed the relationship between the population baseline data and conceptual symptoms; quantitative and qualitative results were examined.
RESULTS
A total of 408 patients with uterine fibroids were included in the study; 190,000 words of free text were obtained after data cleaning. The mean age of the patients was 39.94 (SD 6.81) years, and their mean BMI was 22.18 (SD 2.78) kg/m². The median reporting times of the 7 major symptoms were 21, 26, 57, 2, 18, 30, and 49 days. Logistic regression models identified preoperative menstrual duration (odds ratio [OR] 1.14, 95% CI 5.86-6.37; P=.009), age of menophania (OR -1.02, 95% CI 11.96-13.47; P=.03), and the number (OR 2.34, 95% CI 1.45-1.83; P=.04) and size of fibroids (OR 0.12, 95% CI 2.43-3.51; P=.04) as significant risk factors for postoperative symptoms.
CONCLUSIONS
Unstructured free texts from social media platforms extracted by NLP technology can be used for analysis. By extracting the conceptual information about patients' health-related quality of life, we can adopt personalized treatment for patients at different stages of recovery to improve their quality of life. Python-based text mining of free-text data can accurately extract symptom burden and save considerable time compared to manual review, maximizing the utility of the extant information in population-based electronic health records for comparative effectiveness research.
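The lexicon-then-regex approach described in the methods can be sketched as follows. This is only an illustration under stated assumptions: the lexicon entries, function names, and sample message are hypothetical, and English phrases are used for readability even though the original chats were in Chinese (where word-boundary matching with `\b` would need a different strategy).

```python
import re
from collections import Counter

# Hypothetical symptom lexicon: concept -> surface forms, standing in for the
# corpus that the authors built by manually coding 30% of the chat texts.
SYMPTOM_LEXICON = {
    "abdominal pain": ["abdominal pain", "stomach ache", "belly hurts"],
    "vaginal discharge": ["vaginal discharge", "discharge"],
    "fever": ["fever", "high temperature"],
}


def compile_patterns(lexicon):
    """Build one case-insensitive regex per symptom concept."""
    patterns = {}
    for concept, variants in lexicon.items():
        # Put longer variants first so "vaginal discharge" wins over "discharge".
        ordered = sorted(variants, key=len, reverse=True)
        alternation = "|".join(re.escape(v) for v in ordered)
        patterns[concept] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def extract_symptoms(text, patterns):
    """Count mentions of each symptom concept in a free-text message."""
    counts = Counter()
    for concept, pattern in patterns.items():
        hits = len(pattern.findall(text))
        if hits:
            counts[concept] = hits
    return counts


patterns = compile_patterns(SYMPTOM_LEXICON)
message = "Doctor, I have a fever and my belly hurts. The discharge is less today."
found = extract_symptoms(message, patterns)
print(dict(found))
```

Counting mentions per concept, rather than per surface form, is what lets variant phrasings collapse into a single symptom for later comparison against the manual review.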
Affiliation(s)
- Jiayuan Zhang
  - State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing, China
- Wei Xu
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Cheng Lei
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Yang Pu
  - State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing, China
- Yubo Zhang
  - State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing, China
- Jingyu Zhang
  - State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing, China
- Hongfan Yu
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Xueyao Su
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Yanyan Huang
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Ruoyan Gong
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Lijun Zhang
  - School of Public Health, Chongqing Medical University, Chongqing, China
- Qiuling Shi
  - State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing, China
  - School of Public Health, Chongqing Medical University, Chongqing, China
4
Hens D, Wyers L, Claeys KG. Validation of an Artificial Intelligence driven framework to automatically detect red flag symptoms in screening for rare diseases in electronic health records: hereditary transthyretin amyloidosis polyneuropathy as a key example. J Peripher Nerv Syst 2023; 28:79-85. [PMID: 36468607 DOI: 10.1111/jns.12523] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/08/2022] [Revised: 11/19/2022] [Accepted: 11/28/2022] [Indexed: 12/07/2022]
Abstract
Rare life-threatening conditions, such as multisystemic hereditary transthyretin amyloidosis (ATTRv) polyneuropathy, are often underdiagnosed or diagnosed late in the disease course, although early diagnosis is crucial for treatment success. Red flag symptoms have been identified, but manual screening of multidisciplinary medical records on this set of symptoms is time-consuming. This study aimed to validate a Natural Language Processing (NLP) algorithm to perform such a search in an automated manner, in order to improve early diagnosis and treatment. A novel state-of-the-art NLP procedure was applied to extract red flag symptoms from patients' electronic medical records and to select patients at risk for ATTRv polyneuropathy for further clinical review. Accuracy of the algorithm was assessed through comparison with a manual standard on a random sample of 300 patients. Out of a retrospective sample of 1015 patients, the NLP algorithm yielded 128 patients with three or more red flag symptoms of which 69 patients were considered eligible for genetic testing after clinical review. High accuracy was found in the detection of red flag symptoms, with F1 scores between 0.88 and 0.98. A relative increase of 48.6% in genetic testing, to identify patients with a rare disease earlier, was demonstrated. An NLP algorithm, after clinical validation, offers a valid and accurate tool to detect red flag symptoms in medical records across multiple disciplines, supporting better screening for patients with rare diseases. This opens the door to further NLP applications, facilitating rapid diagnosis and early treatment of rare diseases.
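The selection step described above (patients with three or more red flag symptoms are forwarded for clinical review) amounts to a simple threshold rule applied to the NLP output. The sketch below assumes the extraction has already produced a list of detected symptoms per patient; the patient IDs, symptom labels, and function name are illustrative placeholders, not the study's actual symptom set or code.

```python
def flag_patients(symptoms_by_patient, min_red_flags=3):
    """Return IDs of patients with at least `min_red_flags` distinct red flag symptoms."""
    return sorted(
        patient_id
        for patient_id, symptoms in symptoms_by_patient.items()
        # Repeated mentions of the same symptom count once.
        if len(set(symptoms)) >= min_red_flags
    )


# Hypothetical NLP output: patient ID -> red flag symptoms detected in the record.
detected = {
    "P001": ["carpal tunnel syndrome", "carpal tunnel syndrome"],
    "P002": ["polyneuropathy", "cardiomyopathy", "autonomic dysfunction"],
    "P003": ["polyneuropathy"],
}

flagged = flag_patients(detected)
print(flagged)
```

Deduplicating with `set` reflects one possible reading of "three or more red flag symptoms" (distinct symptoms rather than mentions); the abstract does not spell out how repeats were handled.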
Affiliation(s)
- Kristl G Claeys
  - Department of Neurology, University Hospitals Leuven, Leuven, Belgium
  - Laboratory for Muscle Diseases and Neuropathies, Department of Neurosciences, KU Leuven, and Leuven Brain Institute (LBI), Leuven, Belgium
5
Zhang J, Xu W, Lei C, Pu Y, Zhang Y, Zhang J, Yu H, Su X, Huang Y, Gong R, Zhang L, Shi Q. Using WeChat clinician-patient group communication data to identify symptom burdens in patients with uterine fibroids under focused ultrasound ablation surgery treatment: Qualitative Study (Preprint). [DOI: 10.2196/preprints.43995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
BACKGROUND
Unlike research project-based health data collection (questionnaires, interviews), social media platforms allow patients to freely discuss their health status and obtain peer support. Previous literature has pointed out that both public and private social platforms can serve as data sources for analysis.
OBJECTIVE
This study aimed to use natural language processing (NLP) techniques to identify concerns regarding the postoperative quality of life and symptom burdens in patients with uterine fibroids after focused ultrasound ablation surgery.
METHODS
Screenshots taken from the clinician-patient WeChat groups were converted into free text using image text recognition technology and used as the research object of this study. Regular expressions in Python were used to search for symptom burdens in over 900,000 words of WeChat group chats associated with 408 patients diagnosed with uterine fibroids at Chongqing Haifu Hospital between 2010 and 2020. We first built a corpus of symptoms by manually coding 30% of the WeChat texts, and then used regular expressions to crawl symptom information from the remaining texts based on this corpus. We compared the results with a manual review (gold standard) of the same records. We then analyzed the relationship between the population baseline data and conceptual symptoms; quantitative and qualitative results were examined.
RESULTS
After data cleaning, 190,000 words of free text from patients with uterine fibroids were obtained. A total of 408 patients were included in the study. The age of the patients was 39.94±6.81 years, and their BMI was 22.18±2.78 kg/m². The median reporting times of the seven major symptoms were 21, 26, 57, 2, 18, 30, and 49 days. Results showed that patients with dysmenorrhea were younger (mean 38.26 (SD 7.05), P=.004) and slimmer (mean 22.37 (SD 3.81), P=.04), with lower fertility and parity (P<.05), and tended to stay longer in the hospital (P<.05). Logistic regression models identified preoperative menstrual duration (OR 1.14, 95% CI 5.86-6.37; P=.009), age of menophania (OR -1.02, 95% CI 11.96-13.47; P=.03), and the number (OR 2.34, 95% CI 1.45-1.83; P=.04) and size of fibroids (OR 0.12, 95% CI 2.43-3.51; P=.04) as significant risk factors for postoperative symptoms.
CONCLUSIONS
Unstructured free texts from social media platforms extracted by NLP technology can be used for analysis. By extracting conceptual information about patients' health-related quality of life (HRQoL), clinicians can adopt personalized treatment for patients at different stages of recovery to improve their quality of life. Python-based text mining of free-text data can accurately extract symptom burden and save considerable time compared to manual review, maximizing the utility of the extant information in population-based electronic health records for comparative effectiveness research.
6
Segura-Bedmar I, Camino-Perdones D, Guerrero-Aspizua S. Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts. BMC Bioinformatics 2022; 23:263. [PMID: 35794528 PMCID: PMC9258216 DOI: 10.1186/s12859-022-04810-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/18/2022] [Accepted: 06/21/2022] [Indexed: 11/10/2022] Open
Abstract
Background and objective
Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify them. In addition to this, rare diseases usually show a wide variety of manifestations, which might make the diagnosis even more difficult. A delayed diagnosis can negatively affect the patient’s life. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) and Deep Learning can help to extract relevant information about rare diseases to facilitate their diagnosis and treatments.
Methods
The paper explores several deep learning techniques such as Bidirectional Long Short Term Memory (BiLSTM) networks or deep contextualized word representations based on Bidirectional Encoder Representations from Transformers (BERT) to recognize rare diseases and their clinical manifestations (signs and symptoms).
Results
BioBERT, a domain-specific language representation based on BERT and trained on biomedical corpora, obtains the best results with an F1 of 85.2% for rare diseases. Since many signs are usually described by complex noun phrases that involve overlapping, nested, and discontinuous entities, the model provides lower results, with an F1 of 57.2%.
Conclusions
While our results are promising, there is still much room for improvement, especially with respect to the identification of clinical manifestations (signs and symptoms).
7
Yates T, Lain A, Campbell J, FitzPatrick DR, Simpson TI. Creation and evaluation of full-text literature-derived, feature-weighted disease models of genetically determined developmental disorders. Database (Oxford) 2022; 2022:baac038. [PMID: 35670729 PMCID: PMC9216525 DOI: 10.1093/database/baac038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Revised: 03/26/2022] [Accepted: 05/25/2022] [Indexed: 11/24/2022]
Abstract
There are >2500 different genetically determined developmental disorders (DD), which, as a group, show very high levels of both locus and allelic heterogeneity. This has led to the wide-spread use of evidence-based filtering of genome-wide sequence data as a diagnostic tool in DD. Determining whether the association of a filtered variant at a specific locus is a plausible explanation of the phenotype in the proband is crucial and commonly requires extensive manual literature review by both clinical scientists and clinicians. Access to a database of weighted clinical features extracted from rigorously curated literature would increase the efficiency of this process and facilitate the development of robust phenotypic similarity metrics. However, given the large and rapidly increasing volume of published information, conventional biocuration approaches are becoming impractical. Here, we present a scalable, automated method for the extraction of categorical phenotypic descriptors from the full-text literature. Papers identified through literature review were downloaded and parsed using the Cadmus custom retrieval package. Human Phenotype Ontology terms were extracted using MetaMap, with 76-84% precision and 65-73% recall. Mean terms per paper increased from 9 in title + abstract, to 68 using full text. We demonstrate that these literature-derived disease models plausibly reflect true disease expressivity more accurately than widely used manually curated models, through comparison with prospectively gathered data from the Deciphering Developmental Disorders study. The area under the curve for receiver operating characteristic (ROC) curves increased by 5-10% through the use of literature-derived models. This work shows that scalable automated literature curation increases performance and adds weight to the need for this strategy to be integrated into informatic variant analysis pipelines. Database URL: https://doi.org/10.1093/database/baac038.
Affiliation(s)
- T M Yates
  - MRC Human Genetics Unit, Western General Hospital, Institute of Genetics and Cancer, The University of Edinburgh, Crewe Road South, Edinburgh EH4 2XU, UK
  - Transforming Genetic Medicine Initiative, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- A Lain
  - Institute for Adaptive and Neural Computation, Informatics Forum, The University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
- J Campbell
  - MRC Human Genetics Unit, Western General Hospital, Institute of Genetics and Cancer, The University of Edinburgh, Crewe Road South, Edinburgh EH4 2XU, UK
  - Simons Initiative for the Developing Brain, The University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XF, UK
- D R FitzPatrick
  - MRC Human Genetics Unit, Western General Hospital, Institute of Genetics and Cancer, The University of Edinburgh, Crewe Road South, Edinburgh EH4 2XU, UK
  - Transforming Genetic Medicine Initiative, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Simons Initiative for the Developing Brain, The University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XF, UK
- T I Simpson
  - Institute for Adaptive and Neural Computation, Informatics Forum, The University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
  - Simons Initiative for the Developing Brain, The University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XF, UK