1
Agurto C, Cecchi G, King S, Eyigoz EK, Parvaz MA, Alia-Klein N, Goldstein RZ. Speak and you shall predict: evidence that speech at initial cocaine abstinence is a biomarker of long-term drug use behavior. Biol Psychiatry 2025:S0006-3223(25)00031-9. [PMID: 39842704 DOI: 10.1016/j.biopsych.2025.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 12/23/2024] [Accepted: 01/12/2025] [Indexed: 01/24/2025]
Abstract
BACKGROUND Valid scalable biomarkers for predicting longitudinal clinical outcomes in psychiatric research are crucial for optimizing intervention and prevention efforts. Here we recorded spontaneous speech from initially abstinent individuals with cocaine use disorder (iCUD) for use in predicting drug use outcomes. METHODS At baseline, 88 iCUD provided 5-minute speech samples describing the positive consequences of quitting drug use and negative consequences of using drugs. Outcomes, including withdrawal, craving, abstinence days, and recent cocaine use, were assessed at three-month intervals up to one year (57 iCUD included in analyses). Predictive modeling compared natural language processing (NLP) techniques, specifically sentence embeddings with established inventories as targets, with models utilizing standard demographic and baseline psychometric variables. RESULTS At short time intervals, maximal predictive power was obtained with non-NLP models that also incorporated the same drug use measures (as the outcomes) obtained at baseline, potentially reflecting their slow rate of change, which could be estimated by linear functions. However, for longer-term predictions, speech samples alone demonstrated statistically significant results, with Spearman r ≥ 0.46 and 80% accuracy for predicting abstinence. Hence speech samples may capture non-linear dynamics over extended intervals more effectively than traditional measures. These results need to be replicated in larger and independent samples. CONCLUSIONS Compared to the common outcome measures used in clinical trials, speech-based measures could be leveraged as better predictors of longitudinal drug use outcomes in initially abstinent iCUD, as potentially generalizable to other subgroups with cocaine addiction, and to additional substance use disorders and related comorbidity.
Affiliation(s)
- Carla Agurto
  - IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY, 10598
- Sarah King
  - Psychiatry and Neuroscience Departments, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York City, NY, 10029
- Elif K Eyigoz
  - IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY, 10598
- Muhammad A Parvaz
  - Psychiatry and Neuroscience Departments, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York City, NY, 10029
  - Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York City, NY, 10029
- Nelly Alia-Klein
  - Psychiatry and Neuroscience Departments, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York City, NY, 10029
- Rita Z Goldstein
  - Psychiatry and Neuroscience Departments, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York City, NY, 10029.
2
Scherbakov DA, Hubig NC, Lenert LA, Alekseyenko AV, Obeid JS. Natural Language Processing and Social Determinants of Health in Mental Health Research: AI-Assisted Scoping Review. JMIR Ment Health 2025; 12:e67192. [PMID: 39819656 DOI: 10.2196/67192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Revised: 11/27/2024] [Accepted: 11/28/2024] [Indexed: 01/19/2025] Open
Abstract
Background The use of natural language processing (NLP) in mental health research is increasing, with a wide range of applications and datasets being investigated. Objective This review aims to summarize the use of NLP in mental health research, with a special focus on the types of text datasets and the use of social determinants of health (SDOH) in NLP projects related to mental health. Methods The search was conducted in September 2024 using a broad search strategy in PubMed, Scopus, and CINAHL Complete. All citations were uploaded to Covidence (Veritas Health Innovation) software. The screening and extraction process took place in Covidence with the help of a custom large language model (LLM) module developed by our team. This LLM module was calibrated and tuned to automate many aspects of the review process. Results The screening process, assisted by the custom LLM, led to the inclusion of 1768 studies in the final review. Most of the reviewed studies (n=665, 42.8%) used clinical data as their primary text dataset, followed by social media datasets (n=523, 33.7%). The United States contributed the highest number of studies (n=568, 36.6%), with depression (n=438, 28.2%) and suicide (n=240, 15.5%) being the most frequently investigated mental health issues. Traditional demographic variables, such as age (n=877, 56.5%) and gender (n=760, 49%), were commonly extracted, while SDOH factors were less frequently reported, with urban or rural status being the most used (n=19, 1.2%). Over half of the citations (n=826, 53.2%) did not provide clear information on dataset accessibility, although a sizable number of studies (n=304, 19.6%) made their datasets publicly available. Conclusions This scoping review underscores the significant role of clinical notes and social media in NLP-based mental health research. Despite the clear relevance of SDOH to mental health, their underutilization presents a gap in current research. 
This review can be a starting point for researchers looking for an overview of mental health projects using text data. Shared datasets could be used to place more emphasis on SDOH in future studies.
Affiliation(s)
- Dmitry A Scherbakov
  - Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, United States
- Nina C Hubig
  - Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, United States
  - Interdisciplinary Transformation University, Linz, Austria
- Leslie A Lenert
  - Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, United States
- Alexander V Alekseyenko
  - Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, United States
- Jihad S Obeid
  - Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, United States
3
Ehrett C, Hegde S, Andre K, Liu D, Wilson T. Leveraging Open-Source Large Language Models for Data Augmentation in Hospital Staff Surveys: Mixed Methods Study. JMIR Medical Education 2024; 10:e51433. [PMID: 39560937 PMCID: PMC11590755 DOI: 10.2196/51433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 02/09/2024] [Accepted: 08/15/2024] [Indexed: 11/20/2024]
Abstract
Background Generative large language models (LLMs) have the potential to revolutionize medical education by generating tailored learning materials, enhancing teaching efficiency, and improving learner engagement. However, the application of LLMs in health care settings, particularly for augmenting small datasets in text classification tasks, remains underexplored, particularly for cost- and privacy-conscious applications that do not permit the use of third-party services such as OpenAI's ChatGPT. Objective This study aims to explore the use of open-source LLMs, such as Large Language Model Meta AI (LLaMA) and Alpaca models, for data augmentation in a specific text classification task related to hospital staff surveys. Methods The surveys were designed to elicit narratives of everyday adaptation by frontline radiology staff during the initial phase of the COVID-19 pandemic. A 2-step process of data augmentation and text classification was conducted. The study generated synthetic data similar to the survey reports using 4 generative LLMs for data augmentation. A different set of 3 classifier LLMs was then used to classify the augmented text for thematic categories. The study evaluated performance on the classification task. Results The overall best-performing combination of LLMs, temperature, classifier, and number of synthetic data cases is via augmentation with LLaMA 7B at temperature 0.7 with 100 augments, using Robustly Optimized BERT Pretraining Approach (RoBERTa) for the classification task, achieving an average area under the receiver operating characteristic (AUC) curve of 0.87 (SD 0.02; ie, 1 SD). The results demonstrate that open-source LLMs can enhance text classifiers' performance for small datasets in health care contexts, providing promising pathways for improving medical education processes and patient care practices. 
Conclusions The study demonstrates the value of data augmentation with open-source LLMs, highlights the importance of privacy and ethical considerations when using LLMs, and suggests future directions for research in this field.
Affiliation(s)
- Carl Ehrett
  - Watt Family Innovation Center, Clemson University, Clemson, SC, United States
- Sudeep Hegde
  - Department of Industrial Engineering, Clemson University, Clemson, SC, United States
- Kwame Andre
  - Department of Computer Science, Clemson University, Clemson, SC, United States
- Dixizi Liu
  - Department of Industrial Engineering, Clemson University, Clemson, SC, United States
- Timothy Wilson
  - Department of Industrial Engineering, Clemson University, Clemson, SC, United States
4
Rohanian O, Nouriborji M, Jauncey H, Kouchaki S, Nooralahzadeh F, Clifton L, Merson L, Clifton DA. Lightweight transformers for clinical natural language processing. Natural Language Engineering 2024; 30:887-914. [PMID: 39588066 PMCID: PMC11586117 DOI: 10.1017/s1351324923000542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 11/23/2023] [Accepted: 11/26/2023] [Indexed: 11/27/2024]
Abstract
Specialised pre-trained language models are becoming more frequent in Natural Language Processing (NLP) since they can potentially outperform models trained on generic texts. BioBERT (Sanh et al., Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv: 1910.01108, 2019) and BioClinicalBERT (Alsentzer et al., Publicly available clinical bert embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, pp. 72-78, 2019) are two examples of such models that have shown promise in medical NLP tasks. Many of these models are overparametrised and resource-intensive, but thanks to techniques like knowledge distillation, it is possible to create smaller versions that perform almost as well as their larger counterparts. In this work, we specifically focus on the development of compact language models for processing clinical texts (i.e. progress notes, discharge summaries, etc.). We developed a number of efficient lightweight clinical transformers using knowledge distillation and continual learning, with the number of parameters ranging from million to million. These models performed comparably to larger models such as BioBERT and ClinicalBioBERT and significantly outperformed other compact models trained on general or biomedical data. Our extensive evaluation was done across several standard datasets and covered a wide range of clinical text-mining tasks, including natural language inference, relation extraction, named entity recognition and sequence classification. To our knowledge, this is the first comprehensive study specifically focused on creating efficient and compact transformers for clinical NLP tasks. The models and code used in this study can be found on our Huggingface profile at https://huggingface.co/nlpie and Github page at https://github.com/nlpie-research/Lightweight-Clinical-Transformers, respectively, promoting reproducibility of our results.
Affiliation(s)
- Omid Rohanian
  - Department of Engineering Science, University of Oxford, Oxford, UK
  - NLPie Research, Oxford, UK
- Hannah Jauncey
  - Infectious Diseases Data Observatory (IDDO), University of Oxford, Oxford, UK
- Samaneh Kouchaki
  - Department of Electrical and Electronic Engineering, University of Surrey, Guildford, UK
- Farhad Nooralahzadeh
  - University of Zürich, Zürich, Switzerland
  - University Hospital of Zürich, Zürich, Switzerland
- Lei Clifton
  - Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Laura Merson
  - ISARIC, Pandemic Sciences Institute, University of Oxford, Oxford, UK
- David A. Clifton
  - Department of Engineering Science, University of Oxford, Oxford, UK
  - Oxford-Suzhou Centre for Advanced Research, Suzhou, China
5
Mahbub M, Goethert I, Danciu I, Knight K, Srinivasan S, Tamang S, Rozenberg-Ben-Dror K, Solares H, Martins S, Trafton J, Begoli E, Peterson GD. Question-answering system extracts information on injection drug use from clinical notes. Communications Medicine 2024; 4:61. [PMID: 38570620 PMCID: PMC10991373 DOI: 10.1038/s43856-024-00470-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 02/29/2024] [Indexed: 04/05/2024] Open
Abstract
BACKGROUND Injection drug use (IDU) can increase mortality and morbidity. Therefore, identifying IDU early and initiating harm reduction interventions can benefit individuals at risk. However, extracting IDU behaviors from patients' electronic health records (EHR) is difficult because there is no other structured data available, such as International Classification of Disease (ICD) codes, and IDU is most often documented in unstructured free-text clinical notes. Although natural language processing can efficiently extract this information from unstructured data, there are no validated tools. METHODS To address this gap in clinical information, we design a question-answering (QA) framework to extract information on IDU from clinical notes for use in clinical operations. Our framework involves two main steps: (1) generating a gold-standard QA dataset and (2) developing and testing the QA model. We use 2323 clinical notes of 1145 patients curated from the US Department of Veterans Affairs (VA) Corporate Data Warehouse to construct the gold-standard dataset for developing and evaluating the QA model. We also demonstrate the QA model's ability to extract IDU-related information from temporally out-of-distribution data. RESULTS Here, we show that for a strict match between gold-standard and predicted answers, the QA model achieves a 51.65% F1 score. For a relaxed match between the gold-standard and predicted answers, the QA model obtains a 78.03% F1 score, along with 85.38% Precision and 79.02% Recall scores. Moreover, the QA model demonstrates consistent performance when subjected to temporally out-of-distribution data. CONCLUSIONS Our study introduces a QA framework designed to extract IDU information from clinical notes, aiming to enhance the accurate and efficient detection of people who inject drugs, extract relevant information, and ultimately facilitate informed patient care.
Affiliation(s)
- Maria Mahbub
  - Cyber Resilience and Intelligence Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA.
- Ian Goethert
  - Information Technology Services Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Ioana Danciu
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
- Kathryn Knight
  - Information Technology Services Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Sudarshan Srinivasan
  - Cyber Resilience and Intelligence Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Suzanne Tamang
  - Program Evaluation and Resource Center, Office of Mental Health and Suicide Prevention, Department of Veterans Affairs, Menlo Park, CA, USA
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Hugo Solares
  - Program Evaluation and Resource Center, Office of Mental Health and Suicide Prevention, Department of Veterans Affairs, Menlo Park, CA, USA
- Susana Martins
  - Program Evaluation and Resource Center, Office of Mental Health and Suicide Prevention, Department of Veterans Affairs, Menlo Park, CA, USA
- Jodie Trafton
  - Program Evaluation and Resource Center, Office of Mental Health and Suicide Prevention, Department of Veterans Affairs, Menlo Park, CA, USA
- Edmon Begoli
  - Cyber Resilience and Intelligence Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Gregory D Peterson
  - Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Knoxville, TN, USA
6
McDaniel BT, Cornet V, Carroll J, Chrones L, Chudzik J, Cochran J, Guha S, Lawrence DF, McCue M, Sarkey S, Lorenz B, Fawver J. Real-world clinical outcomes and treatment patterns in patients with MDD treated with vortioxetine: a retrospective study. BMC Psychiatry 2023; 23:938. [PMID: 38093196 PMCID: PMC10720213 DOI: 10.1186/s12888-023-05439-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2023] [Accepted: 12/04/2023] [Indexed: 12/17/2023] Open
Abstract
BACKGROUND This study included evaluation of the effectiveness of vortioxetine, a treatment for adults with major depressive disorder (MDD), using patient-reported outcome measures (PROMs) in a real-world setting. METHODS This retrospective chart review analyzed the care experiences of adult patients with a diagnosis of MDD from Parkview Physicians Group - Mind-Body Medicine, Midwestern United States. Patients with a prescription for vortioxetine, an initial baseline visit, and ≥ 2 follow-up visits within 16 weeks from September 2014 to December 2018 were included. The primary outcome measure was effectiveness of vortioxetine on depression severity as assessed by change in Patient Health Questionnaire-9 (PHQ-9) scores ~ 12 weeks after initiation of vortioxetine. Secondary outcomes included changes in depression-related symptoms (i.e., sexual dysfunction, sleep disturbance, cognitive function, work/social function), clinical characteristics, response, remission, and medication persistence. Clinical narrative notes were also analyzed to examine sleep disturbance, sexual dysfunction, appetite, absenteeism, and presenteeism. All outcomes were examined at index (start of vortioxetine) and at ~ 12 weeks, and mean differences were analyzed using pairwise t tests. RESULTS A total of 1242 patients with MDD met inclusion criteria, and 63.9% of these patients had ≥ 3 psychiatric diagnoses and 65.9% were taking ≥ 3 medications. PHQ-9 mean scores decreased significantly from baseline to week 12 (14.15 ± 5.8 to 9.62 ± 6.03, respectively; p < 0.001). At week 12, the response and remission rates in all patients were 31.0% and 23.1%, respectively, and 67% continued vortioxetine treatment. Overall, results also showed significant improvements by week 12 in anxiety (p < 0.001), sexual dysfunction (p < 0.01), sleep disturbance (p < 0.01), cognitive function (p < 0.001), work/social functioning (p = 0.021), and appetite (p < 0.001). 
A significant decrease in presenteeism was observed at week 12 (p < 0.001); however, no significant change was observed in absenteeism (p = 0.466). CONCLUSIONS Using PROMs, our study results suggest that adults with MDD prescribed vortioxetine showed improvement in depressive symptoms in the context of a real-world clinical practice setting. These patients had multiple comorbid psychiatric and physical diagnoses and multiple previous antidepressant treatments had failed.
Affiliation(s)
- Brandon T McDaniel
  - Parkview Mirro Center for Research and Innovation, 10622 Parkview Plaza Drive, Fort Wayne, IN, 46845, US
- Victor Cornet
  - Parkview Mirro Center for Research and Innovation, 10622 Parkview Plaza Drive, Fort Wayne, IN, 46845, US
- Jeanne Carroll
  - Parkview Mirro Center for Research and Innovation, 10622 Parkview Plaza Drive, Fort Wayne, IN, 46845, US
- Joseph Chudzik
  - Parkview Mirro Center for Research and Innovation, 10622 Parkview Plaza Drive, Fort Wayne, IN, 46845, US
- Jeanette Cochran
  - Parkview Physicians Group - Mind-Body Medicine, Fort Wayne, IN, US
- Shion Guha
  - Parkview Mirro Center for Research and Innovation, 10622 Parkview Plaza Drive, Fort Wayne, IN, 46845, US
  - Faculty of Information, Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Maggie McCue
  - Takeda Pharmaceuticals U.S.A., Inc, Lexington, MA, US
- Sara Sarkey
  - Takeda Pharmaceuticals U.S.A., Inc, Lexington, MA, US
- Betty Lorenz
  - Takeda Pharmaceuticals U.S.A., Inc, Lexington, MA, US
- Jay Fawver
  - Parkview Physicians Group - Mind-Body Medicine, Fort Wayne, IN, US.
7
Agurto C, Cecchi G, King S, Eyigoz EK, Parvaz MA, Alia-Klein N, Goldstein RZ. Speak and you shall predict: speech at initial cocaine abstinence as a biomarker of long-term drug use behavior. bioRxiv: the preprint server for biology 2023:2023.07.18.549548. [PMID: 37503140 PMCID: PMC10370100 DOI: 10.1101/2023.07.18.549548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Importance Valid biomarkers that can predict longitudinal clinical outcomes at low cost are a holy grail in psychiatric research, promising to ultimately be used to optimize and tailor intervention and prevention efforts. Objective To determine if baseline linguistic markers in natural speech, as compared to non-speech clinical and demographic measures, can predict drug use severity measures at future sessions in initially abstinent individuals with cocaine use disorder (iCUD). Design A longitudinal cohort study (August 2017 - March 2020), where baseline measures were used to predict outcomes collected at three-month intervals for up to one year of follow-up. Participants Eighty-eight initially abstinent iCUD were studied at baseline; 57 (46 male, age 50.7 ± 7.9 years) came back for at least another session. Main Outcomes and Measures Outcomes were self-reported symptoms of withdrawal, craving, abstinence duration and frequency of cocaine use in the past 90 days at each study session. The predictors were derived from 5-min recordings of vocal descriptions of the positive consequences of abstinence and the negative consequences of using cocaine; the baseline cocaine and other common drug use measures, demographic and neuropsychological variables were used for comparison. Results Models using the non-speech variables showed the best predictive performance at three (r>0.45, P<2×10⁻³) and six months follow-up (r>0.37, P<3×10⁻²). At 12 months, the natural language processing-based model showed significant correlations with withdrawal (r=0.43, P=3×10⁻²), craving (r=0.72, P=5×10⁻⁵), days of abstinence (r=0.76, P=1×10⁻⁵), and cocaine use in the past 90 days (r=0.61, P=2×10⁻³), significantly outperforming the other models for abstinence prediction.
Conclusions and Relevance At short time intervals, maximal predictive power was obtained with models that used baseline drug use (in addition to demographic and neuropsychological) measures, potentially reflecting a slow rate of change in these measures, which could be estimated by linear functions. In contrast, short speech samples predicted longer-term changes in drug use, implying deeper penetrance by potentially capturing non-linear dynamics over longer intervals. Results suggest that, compared to the common outcome measures used in clinical trials, speech-based measures could be leveraged as better predictors of longitudinal drug use outcomes in initially abstinent iCUD, as potentially generalizable to other substance use disorders and related comorbidity.
Affiliation(s)
- Carla Agurto
  - IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY, 10598
- Sarah King
  - Psychiatry and Neuroscience Departments, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York City, NY, 10029
- Elif K. Eyigoz
  - IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY, 10598
- Muhammad A. Parvaz
  - Psychiatry and Neuroscience Departments, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York City, NY, 10029
  - Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York City, NY, 10029
- Nelly Alia-Klein
  - Psychiatry and Neuroscience Departments, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York City, NY, 10029
- Rita Z. Goldstein
  - Psychiatry and Neuroscience Departments, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York City, NY, 10029
8
Wu CS, Chen CH, Su CH, Chien YL, Dai HJ, Chen HH. Augmenting DSM-5 diagnostic criteria with self-attention-based BiLSTM models for psychiatric diagnosis. Artif Intell Med 2023; 136:102488. [PMID: 36710066 DOI: 10.1016/j.artmed.2023.102488] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/06/2022] [Revised: 11/20/2022] [Accepted: 01/09/2023] [Indexed: 01/12/2023]
Abstract
BACKGROUND Most previous studies make psychiatric diagnoses based on diagnostic terms. In this study we sought to augment Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5) diagnostic criteria with deep neural network models to make psychiatric diagnoses based on psychiatric notes. METHODS We augmented DSM-5 diagnostic criteria with self-attention-based bidirectional long short-term memory (BiLSTM) models to identify schizophrenia, bipolar, and unipolar depressive disorders. Given that the diagnostic criteria for psychiatric diagnosis include a certain symptom profile and functional impairment, we first extracted psychiatric symptoms and functional features with two approaches, including a lexicon-based approach and a dependency parsing approach. Then, we incorporated free-text discharge notes and extracted features for psychiatric diagnoses with the proposed models. RESULTS The micro-averaged F1 scores of the two automatic annotation approaches were greater than 0.8. BiLSTM models with self-attention outperformed the rule-based models with DSM-5 criteria in the prediction of schizophrenia and bipolar disorder, while the latter outperformed the former in predicting unipolar depressive disorder. Approaches for augmenting DSM-5 criteria with a self-attention-based BiLSTM outperformed both pure rule-based and pure deep neural network models. In terms of classification of psychiatric diagnoses, we observed that the performance for schizophrenia and bipolar disorder was acceptable. CONCLUSION These DSM-5-augmented deep neural network models showed good performance in identifying psychiatric diagnoses from psychiatric notes. We conclude that it is possible to establish a model that consults clinical notes to make psychiatric diagnoses comparably to physicians. Further research will be extended to outpatient notes and other psychiatric disorders.
Affiliation(s)
- Chi-Shin Wu
  - National Center for Geriatrics and Welfare Research, National Health Research Institutes, Zhunan, Taiwan
  - Department of Psychiatry, National Taiwan University Hospital, Yunlin branch, Douliu, Taiwan
- Chien-Hung Chen
  - Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan
- Chu-Hsien Su
  - National Center for Geriatrics and Welfare Research, National Health Research Institutes, Zhunan, Taiwan
- Yi-Ling Chien
  - Department of Psychiatry, National Taiwan University Hospital, Taipei, Taiwan
- Hong-Jie Dai
  - Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
  - School of Post-Baccalaureate Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
  - National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan
- Hsin-Hsi Chen
  - Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan.
9
Wang L, Zhang Y, Chignell M, Shan B, Sheehan KA, Razak F, Verma A. Boosting Delirium Identification Accuracy With Sentiment-Based Natural Language Processing: Mixed Methods Study. JMIR Med Inform 2022; 10:e38161. [PMID: 36538363 PMCID: PMC9812273 DOI: 10.2196/38161] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/21/2022] [Revised: 08/22/2022] [Accepted: 09/19/2022] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Delirium is an acute neurocognitive disorder that affects up to half of older hospitalized medical patients and can lead to dementia, longer hospital stays, increased health costs, and death. Although delirium can be prevented and treated, it is difficult to identify and predict. OBJECTIVE This study aimed to improve machine learning models that retrospectively identify the presence of delirium during hospital stays (eg, to measure the effectiveness of delirium prevention interventions) by using the natural language processing (NLP) technique of sentiment analysis (in this case a feature that identifies sentiment toward, or away from, a delirium diagnosis). METHODS Using data from the General Medicine Inpatient Initiative, a Canadian hospital data and analytics network, a detailed manual review of medical records was conducted from nearly 4000 admissions at 6 Toronto area hospitals. Furthermore, 25.74% (994/3862) of the eligible hospital admissions were labeled as having delirium. Using the data set collected from this study, we developed machine learning models with, and without, the benefit of NLP methods applied to diagnostic imaging reports, and we asked the question "can NLP improve machine learning identification of delirium?" RESULTS Among the eligible 3862 hospital admissions, 994 (25.74%) admissions were labeled as having delirium. Identification and calibration of the models were satisfactory. The accuracy and area under the receiver operating characteristic curve of the main model with NLP in the independent testing data set were 0.807 and 0.930, respectively. The accuracy and area under the receiver operating characteristic curve of the main model without NLP in the independent testing data set were 0.811 and 0.869, respectively. 
Model performance was also found to be stable over the 5-year period used in the experiment, with identification for a likely future holdout test set being no worse than identification for retrospective holdout test sets. CONCLUSIONS Our machine learning model that included NLP (ie, sentiment analysis in medical image description text mining) produced valid identification of delirium with the sentiment analysis, providing significant additional benefit over the model without NLP.
Affiliation(s)
- Lu Wang
  - Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, Texas State University, San Marcos, TX, United States
- Yilun Zhang
  - Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, ON, Canada
- Mark Chignell
  - Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, ON, Canada
- Baizun Shan
  - Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, ON, Canada
- Kathleen A Sheehan
  - GEMINI - The General Medicine Inpatient Initiative, Unity Health Toronto, Toronto, ON, Canada
  - Department of Psychiatry, University of Toronto, Toronto, ON, Canada
- Fahad Razak
  - GEMINI - The General Medicine Inpatient Initiative, Unity Health Toronto, Toronto, ON, Canada
  - Faculty of Medicine & Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
- Amol Verma
  - GEMINI - The General Medicine Inpatient Initiative, Unity Health Toronto, Toronto, ON, Canada
  - Faculty of Medicine & Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
|
10
|
Stewart de Ramirez S, Shallat J, McClure K, Foulger R, Barenblat L. Screening for Social Determinants of Health: Active and Passive Information Retrieval Methods. Popul Health Manag 2022; 25:781-788. [PMID: 36454231 DOI: 10.1089/pop.2022.0228] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/03/2022] Open
Abstract
Screening for social determinants of health (SDOH) is recommended, but numerous barriers exist to implementing SDOH screening in clinical spaces. In this study, the authors identified how both active and passive information retrieval methods may be used in clinical spaces to screen for SDOH and meet patient needs. The authors conducted a retrospective sequential cohort analysis comparing the active identification of SDOH through a patient-led digital manual screening process completed in primary care offices from September 2019 to January 2020 and passive identification of SDOH through natural language processing (NLP) from September 2016 to August 2018, among 1735 patients at a large midwestern tertiary referral hospital system and its associated outlying primary care and outpatient facilities. The percent of patients identified by both the passive and active identification methods as experiencing SDOH varied from 0.3% to 4.7%. The active identification method identified social integration, domestic safety, financial resources, food insecurity, transportation, housing, and stress in proportions ranging from 5% to 36%. The passive method contributed to the identification of financial resource issues and stress, identifying 9.6% and 3% of patients to be experiencing these issues, respectively. SDOH documentation varied by provider type. The combination of passive and active SDOH screening methods can provide a more comprehensive picture by leveraging historic patient interactions, while also eliciting current patient needs. Using passive, NLP-based methods to screen for SDOH will also help providers overcome barriers that have historically prevented screening.
Affiliation(s)
- Sarah Stewart de Ramirez
  - Department of Population Health Services, OSF HealthCare System, Peoria, Illinois, USA
  - Department of Emergency Medicine, University of Illinois College of Medicine at Peoria, Peoria, Illinois, USA
- Jaclyn Shallat
  - Department of Epidemiology and Biostatistics, University of Illinois at Chicago, Chicago, Illinois, USA
- Keaton McClure
  - University of Illinois College of Medicine at Peoria, Peoria, Illinois, USA
- Roopa Foulger
  - Department of Health Care Analytics, OSF HealthCare System, Peoria, Illinois, USA
  - Department of OSF OnCall, OSF Healthcare System, Peoria, Illinois, USA
|
11
|
Ridgway JP, Ajith A, Friedman EE, Mugavero MJ, Kitahata MM, Crane HM, Moore RD, Webel A, Cachay ER, Christopoulos KA, Mayer KH, Napravnik S, Mayampurath A. Multicenter Development and Validation of a Model for Predicting Retention in Care Among People with HIV. AIDS Behav 2022; 26:3279-3288. [PMID: 35394586 PMCID: PMC9474706 DOI: 10.1007/s10461-022-03672-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 03/23/2022] [Indexed: 11/26/2022]
Abstract
Predictive analytics can be used to identify people with HIV currently retained in care who are at risk for future disengagement from care, allowing for prioritization of retention interventions. We utilized machine learning methods to develop predictive models of retention in care, defined as no more than a 12 month gap between HIV care appointments in the Center for AIDS Research Network of Integrated Clinical Systems (CNICS) cohort. Data were split longitudinally into derivation and validation cohorts. We created logistic regression (LR), random forest (RF), and gradient boosted machine (XGB) models within a discrete-time survival analysis framework and compared their performance to a baseline model that included only demographics, viral suppression, and retention history. 21,267 Patients with 507,687 visits from 2007 to 2018 were included. The LR model outperformed the baseline model (AUC 0.68 [0.67-0.70] vs. 0.60 [0.59-0.62], P < 0.001). RF and XGB models had similar performance to the LR model. Top features in the LR model included retention history, age, and viral suppression.
Affiliation(s)
- Jessica P Ridgway
  - Department of Medicine, University of Chicago, 5841 S Maryland Ave, MC 5065, Chicago, IL, 60637, USA
- Aswathy Ajith
  - Center for Research Informatics, University of Chicago, Chicago, IL, USA
- Eleanor E Friedman
  - Department of Medicine, University of Chicago, 5841 S Maryland Ave, MC 5065, Chicago, IL, 60637, USA
- Mari M Kitahata
  - Department of Medicine, University of Washington, Seattle, WA, USA
- Heidi M Crane
  - Department of Medicine, University of Washington, Seattle, WA, USA
- Richard D Moore
  - Department of Medicine, Johns Hopkins University, Baltimore, MD, USA
- Allison Webel
  - Frances Payne Bolton School of Nursing, Case Western Reserve University, Cleveland, OH, USA
- Edward R Cachay
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Sonia Napravnik
  - Department of Medicine, University of North Carolina, Chapel Hill, NC, USA
|
12
|
Mahmoudi E, Wu W, Najarian C, Aikens J, Bynum J, Vydiswaran VV. Identify Caregiver Availability Using Medical Notes: Rule-Based Natural Language Processing. JMIR Aging 2022; 5:e40241. [PMID: 35998328 PMCID: PMC9539648 DOI: 10.2196/40241] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/12/2022] [Revised: 07/28/2022] [Accepted: 08/16/2022] [Indexed: 11/23/2022] Open
Abstract
Background Identifying caregiver availability, particularly for patients with dementia or those with a disability, is critical to informing the appropriate care planning by the health systems, hospitals, and providers. This information is not readily available, and there is a paucity of pragmatic approaches to automatically identifying caregiver availability and type. Objective Our main objective was to use medical notes to assess caregiver availability and type for hospitalized patients with dementia. Our second objective was to identify whether the patient lived at home or resided at an institution. Methods In this retrospective cohort study, we used 2016-2019 telephone-encounter medical notes from a single institution to develop a rule-based natural language processing (NLP) algorithm to identify the patient’s caregiver availability and place of residence. Using note-level data, we compared the results of the NLP algorithm with human-conducted chart abstraction for both training (749/976, 77%) and test sets (227/976, 23%) for a total of 223 adults aged 65 years and older diagnosed with dementia. Our outcomes included determining whether the patients (1) reside at home or in an institution, (2) have a formal caregiver, and (3) have an informal caregiver. Results Test set results indicated that our NLP algorithm had a high level of accuracy and reliability for identifying whether patients had an informal caregiver (F1=0.94, accuracy=0.95, sensitivity=0.97, and specificity=0.93), but was relatively less able to identify whether the patient lived at an institution (F1=0.64, accuracy=0.90, sensitivity=0.51, and specificity=0.98). The most common explanations for NLP misclassifications across all categories were (1) incomplete or misspelled facility names; (2) past, uncertain, or undecided status; (3) uncommon abbreviations; and (4) irregular use of templates. Conclusions This innovative work was the first to use medical notes to pragmatically determine caregiver availability. Our NLP algorithm identified whether hospitalized patients with dementia have a formal or informal caregiver and, to a lesser extent, whether they lived at home or in an institutional setting. There is merit in using NLP to identify caregivers. This study serves as a proof of concept. Future work can use other approaches and further identify caregivers and the extent of their availability.
Affiliation(s)
- Elham Mahmoudi
  - Department of Family Medicine, Medical School, University of Michigan, Institute for Healthcare Policy and Innovation, University of Michigan, NCRC Building 14, Room G234, 2800 Plymouth Rd., Ann Arbor, US
- Wenbo Wu
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, US
- Cyrus Najarian
  - University of Michigan Medical School, University of Michigan, Ann Arbor, US
- James Aikens
  - Department of Family Medicine, Medical School, University of Michigan, Ann Arbor, US
- Julie Bynum
  - Medical School, University of Michigan, Ann Arbor, US
- Vg Vinod Vydiswaran
  - Department of Learning Health Sciences, Medical School, University of Michigan, Ann Arbor, US
|
13
|
A comparison of methods to identify antenatal substance use within electronic health records. Am J Obstet Gynecol MFM 2022; 4:100535. [PMID: 34808402 PMCID: PMC8893715 DOI: 10.1016/j.ajogmf.2021.100535] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/13/2021] [Revised: 11/04/2021] [Accepted: 11/15/2021] [Indexed: 11/22/2022]
|
14
|
Lea AN, Altschuler A, Leibowitz AS, Levine-Hall T, McNeely J, Silverberg MJ, Satre DD. Patient and provider perspectives on self-administered electronic substance use and mental health screening in HIV primary care. Addict Sci Clin Pract 2022; 17:10. [PMID: 35139911 PMCID: PMC8827178 DOI: 10.1186/s13722-022-00293-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/14/2021] [Accepted: 01/26/2022] [Indexed: 11/26/2022] Open
Abstract
Background Substance use disorders, depression and anxiety disproportionately affect people with HIV (PWH) and lead to increased morbidity and mortality. Routine screening can help address these problems but is underutilized. This study sought to describe patient and provider perspectives on the acceptability and usefulness of systematic electronic, self-administered screening for tobacco, alcohol, other substance use, and mental health symptoms among patients in HIV primary care. Methods Screening used validated instruments delivered pre-appointment by both secure messaging and clinic-based tablets, with results integrated into the electronic health record (EHR). Qualitative analysis of semi-structured interviews with 9 HIV primary care providers and 12 patients in the 3 largest HIV primary care clinics in the Kaiser Permanente Northern California health system who participated in a clinical trial evaluating computerized screening and behavioral interventions was conducted. Interviews were audio-recorded and transcribed. A thematic approach was utilized for coding and analysis of interview data using a combination of deductive and inductive methods. Results Four key themes were identified: (1) perceived clinical benefit of systematic, electronic screening and EHR integration for providers and patients; (2) usefulness of having multiple methods of questionnaire completion; (3) importance of the patient–provider relationship to facilitate completion and accurate reporting; and (4) barriers, including privacy and confidentiality concerns about reporting sensitive information, particularly about substance use, and potential burden from repeated screenings. Conclusions Findings suggest that electronic, self-administered substance use and mental health screening is acceptable to patients and may have clinical utility to providers. While offering different methods of screening completion can capture a wider range of patients, a strong patient–provider relationship is a key factor in overcoming barriers and ensuring accurate patient responses. Further investigation into facilitators, barriers, and utility of electronic screening for PWH and other high-priority patient populations is indicated. Trial registration ClinicalTrials.gov, NCT03217058. Registered 13 July 2017, https://clinicaltrials.gov/ct2/show/NCT03217058 Supplementary Information The online version contains supplementary material available at 10.1186/s13722-022-00293-7.
Affiliation(s)
- Alexandra N Lea
  - Division of Research, Kaiser Permanente Northern California, 2000 Broadway, Oakland, CA, 94612, USA
- Andrea Altschuler
  - Division of Research, Kaiser Permanente Northern California, 2000 Broadway, Oakland, CA, 94612, USA
- Amy S Leibowitz
  - Division of Research, Kaiser Permanente Northern California, 2000 Broadway, Oakland, CA, 94612, USA
- Tory Levine-Hall
  - Division of Research, Kaiser Permanente Northern California, 2000 Broadway, Oakland, CA, 94612, USA
- Jennifer McNeely
  - Department of Population Health, Section on Tobacco, Alcohol, and Drug Use, New York University Grossman School of Medicine, 180 Madison Ave., New York, NY, 10016, USA
- Michael J Silverberg
  - Division of Research, Kaiser Permanente Northern California, 2000 Broadway, Oakland, CA, 94612, USA
- Derek D Satre
  - Division of Research, Kaiser Permanente Northern California, 2000 Broadway, Oakland, CA, 94612, USA
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California, San Francisco, 401 Parnassus Avenue, Box 0984, San Francisco, CA, 94143, USA
|
15
|
Wang K, Tan F, Zhu Z, Kong L. Exploring changes in depression and radiology-related publications research focus: A bibliometrics and content analysis based on natural language processing. Front Psychiatry 2022; 13:978763. [PMID: 36532194 PMCID: PMC9748702 DOI: 10.3389/fpsyt.2022.978763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Accepted: 11/14/2022] [Indexed: 12/02/2022] Open
Abstract
OBJECTIVE This study aims to construct and use natural language processing and other methods to analyze major depressive disorder (MDD) and radiology studies' publications in the PubMed database to understand the historical growth, current state, and potential expansion trend. METHODS All MDD radiology studies publications from January 2002 to January 2022 were downloaded from PubMed using R, a statistical computing language. R and the interpretive general-purpose programming language Python were used to extract publication dates, geographic information, and abstracts from each publication's metadata for bibliometric analysis. The generative statistical algorithm "Latent Dirichlet allocation" (LDA) was applied to identify specific research focus and trends. The unsupervised Leuven algorithm was used to build a network to identify relationships between research focus. RESULTS A total of 5,566 publications on MDD and radiology research were identified, and there is a rapid upward trend. The top-cited publications were 11,042, and the highly-cited publications focused on improving diagnostic performance and establishing imaging standards. Publications came from 76 countries, with the most from research institutions in the United States and China. Hospitals and radiology departments take the lead in research and have an advantage. The extensive field of study contains 12,058 Medical Subject Heading (MeSH) terms. Based on the LDA algorithm, three areas were identified that have become the focus of research in recent years, "Symptoms and treatment," "Brain structure and imaging," and "Comorbidities research." CONCLUSION Latent Dirichlet allocation analysis methods can be well used to analyze many texts and discover recent research trends and focus. In the past 20 years, the research on MDD and radiology has focused on exploring MDD mechanisms, establishing standards, and constructing imaging methods. 
Recent research focuses are "Symptoms and sleep," "Brain structure study," and "functional connectivity." New progress may be made in studies on MDD complications and the combination of brain structure and metabolism.
Affiliation(s)
- Kangtao Wang
  - Department of General Surgery, Xiangya Hospital, Central South University, Changsha, Hunan, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Fengbo Tan
  - Department of General Surgery, Xiangya Hospital, Central South University, Changsha, Hunan, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Zhiming Zhu
  - Department of Radiology, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Lingyu Kong
  - Department of Radiology, Xiangya Hospital, Central South University, Changsha, Hunan, China
|
16
|
Bae JH, Han HW, Yang SY, Song G, Sa S, Chung GE, Seo JY, Jin EH, Kim H, An D. Development of a Natural Language Processing System for Assessing Quality Indicators from Free-Text Colonoscopy and Pathology Reports: Methodology Development and Applications (Preprint). JMIR Med Inform 2021; 10:e35257. [PMID: 35436226 PMCID: PMC9055472 DOI: 10.2196/35257] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/29/2021] [Revised: 02/13/2022] [Accepted: 02/25/2022] [Indexed: 12/25/2022] Open
Abstract
Background Manual data extraction of colonoscopy quality indicators is time and labor intensive. Natural language processing (NLP), a computer-based linguistics technique, can automate the extraction of important clinical information, such as adverse events, from unstructured free-text reports. NLP information extraction can facilitate the optimization of clinical work by helping to improve quality control and patient management. Objective We developed an NLP pipeline to analyze free-text colonoscopy and pathology reports and evaluated its ability to automatically assess adenoma detection rate (ADR), sessile serrated lesion detection rate (SDR), and postcolonoscopy surveillance intervals. Methods The NLP tool for extracting colonoscopy quality indicators was developed using a data set of 2000 screening colonoscopy reports from a single health care system, with an associated 1425 pathology reports. The NLP system was then tested on a data set of 1000 colonoscopy reports and its performance was compared with that of 5 human annotators. Additionally, data from 54,562 colonoscopies performed between 2010 and 2019 were analyzed using the NLP pipeline. Results The NLP pipeline achieved an overall accuracy of 0.99-1.00 for identifying polyp subtypes, 0.99-1.00 for identifying the anatomical location of polyps, and 0.98 for counting the number of neoplastic polyps. The NLP pipeline achieved performance similar to clinical experts for assessing ADR, SDR, and surveillance intervals. NLP analysis of a 10-year colonoscopy data set identified great individual variance in colonoscopy quality indicators among 25 endoscopists. Conclusions The NLP pipeline could accurately extract information from colonoscopy and pathology reports and demonstrated clinical efficacy for assessing ADR, SDR, and surveillance intervals in these reports. 
Implementation of the system enabled automated analysis and feedback on quality indicators, which could motivate endoscopists to improve the quality of their performance and improve clinical decision-making in colorectal cancer screening programs.
Affiliation(s)
- Jung Ho Bae
  - Department of Biomedical Informatics, CHA University School of Medicine, CHA University, Seongnam, Republic of Korea
  - Institute for Biomedical Informatics, CHA University School of Medicine, CHA University, Seongnam, Republic of Korea
  - Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, Republic of Korea
- Hyun Wook Han
  - Department of Biomedical Informatics, CHA University School of Medicine, CHA University, Seongnam, Republic of Korea
  - Institute for Biomedical Informatics, CHA University School of Medicine, CHA University, Seongnam, Republic of Korea
- Sun Young Yang
  - Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, Republic of Korea
- Gyuseon Song
  - Department of Biomedical Informatics, CHA University School of Medicine, CHA University, Seongnam, Republic of Korea
  - Institute for Biomedical Informatics, CHA University School of Medicine, CHA University, Seongnam, Republic of Korea
- Soonok Sa
  - Department of Biomedical Informatics, CHA University School of Medicine, CHA University, Seongnam, Republic of Korea
  - Institute for Biomedical Informatics, CHA University School of Medicine, CHA University, Seongnam, Republic of Korea
- Goh Eun Chung
  - Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, Republic of Korea
- Ji Yeon Seo
  - Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, Republic of Korea
- Eun Hyo Jin
  - Department of Internal Medicine and Healthcare Research Institute, Healthcare System Gangnam Center, Seoul National University Hospital, Seoul, Republic of Korea
- Heecheon Kim
  - Miso Info Tech Co, Ltd, Seoul, Republic of Korea
- DongUk An
  - Miso Info Tech Co, Ltd, Seoul, Republic of Korea
|
17
|
Zanotto BS, Beck da Silva Etges AP, Dal Bosco A, Cortes EG, Ruschel R, De Souza AC, Andrade CMV, Viegas F, Canuto S, Luiz W, Ouriques Martins S, Vieira R, Polanczyk C, André Gonçalves M. Stroke Outcome Measurements From Electronic Medical Records: Cross-sectional Study on the Effectiveness of Neural and Nonneural Classifiers. JMIR Med Inform 2021; 9:e29120. [PMID: 34723829 PMCID: PMC8593798 DOI: 10.2196/29120] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2021] [Revised: 06/27/2021] [Accepted: 08/05/2021] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND With the rapid adoption of electronic medical records (EMRs), there is an ever-increasing opportunity to collect data and extract knowledge from EMRs to support patient-centered stroke management. OBJECTIVE This study aims to compare the effectiveness of state-of-the-art automatic text classification methods in classifying data to support the prediction of clinical patient outcomes and the extraction of patient characteristics from EMRs. METHODS Our study addressed the computational problems of information extraction and automatic text classification. We identified essential tasks to be considered in an ischemic stroke value-based program. The 30 selected tasks were classified (manually labeled by specialists) according to the following value agenda: tier 1 (achieved health care status), tier 2 (recovery process), care related (clinical management and risk scores), and baseline characteristics. The analyzed data set was retrospectively extracted from the EMRs of patients with stroke from a private Brazilian hospital between 2018 and 2019. A total of 44,206 sentences from free-text medical records in Portuguese were used to train and develop 10 supervised computational machine learning methods, including state-of-the-art neural and nonneural methods, along with ontological rules. As an experimental protocol, we used a 5-fold cross-validation procedure repeated 6 times, along with subject-wise sampling. A heatmap was used to display comparative result analyses according to the best algorithmic effectiveness (F1 score), supported by statistical significance tests. A feature importance analysis was conducted to provide insights into the results. RESULTS The top-performing models were support vector machines trained with lexical and semantic textual features, showing the importance of dealing with noise in EMR textual representations. 
The support vector machine models produced statistically superior results in 71% (17/24) of tasks, with an F1 score >80% regarding care-related tasks (patient treatment location, fall risk, thrombolytic therapy, and pressure ulcer risk), the process of recovery (ability to feed orally or ambulate and communicate), health care status achieved (mortality), and baseline characteristics (diabetes, obesity, dyslipidemia, and smoking status). Neural methods were largely outperformed by more traditional nonneural methods, given the characteristics of the data set. Ontological rules were also effective in tasks such as baseline characteristics (alcoholism, atrial fibrillation, and coronary artery disease) and the Rankin scale. The complementarity in effectiveness among models suggests that a combination of models could enhance the results and cover more tasks in the future. CONCLUSIONS Advances in information technology capacity are essential for scalability and agility in measuring health status outcomes. This study allowed us to measure effectiveness and identify opportunities for automating the classification of outcomes of specific tasks related to clinical conditions of stroke victims, and thus ultimately assess the possibility of proactively using these machine learning techniques in real-world situations.
Affiliation(s)
- Bruna Stella Zanotto
  - National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
  - Graduate Program in Epidemiology, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Ana Paula Beck da Silva Etges
  - National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
  - School of Technology, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, Brazil
- Avner Dal Bosco
  - School of Technology, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, Brazil
- Eduardo Gabriel Cortes
  - Graduate Program of Computer Science, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Renata Ruschel
  - National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Claudio M V Andrade
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Felipe Viegas
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Sergio Canuto
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Washington Luiz
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Renata Vieira
  - Centro Interdisciplinar de História, Culturas e Sociedades (CIDEHUS), Universidade de Évora, Évora, Portugal
- Carisi Polanczyk
  - National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
  - Graduate Program in Epidemiology, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Marcos André Gonçalves
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
|
18
|
Machine Learning and Clinical Informatics for Improving HIV Care Continuum Outcomes. Curr HIV/AIDS Rep 2021; 18:229-236. [PMID: 33661445 DOI: 10.1007/s11904-021-00552-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Accepted: 02/23/2021] [Indexed: 10/22/2022]
Abstract
PURPOSE OF REVIEW This manuscript reviews the use of electronic medical record (EMR) data for HIV care and research along the HIV care continuum with a specific focus on machine learning methods and clinical informatics interventions. RECENT FINDINGS EMR-based clinical decision support tools and electronic alerts have been effectively utilized to improve HIV care continuum outcomes. Accurate EMR-based machine learning models have been developed to predict HIV diagnosis, retention in care, and viral suppression. Natural language processing (NLP) of clinical notes and data sharing between healthcare systems and public health agencies can enhance models for identifying people living with HIV who are undiagnosed or in need of relinkage to care. Challenges related to using these technologies include inconsistent EMR documentation, alert fatigue, and the potential for bias. Clinical informatics and machine learning models are promising tools for improving HIV care continuum outcomes. Future research should focus on methods for combining EMR data with additional data sources (e.g., social media, geospatial data) and studying how to effectively implement predictive models for HIV care into clinical practice.
|