1. Chen JS, Reddy AJ, Al-Sharif E, Shoji MK, Kalaw FGP, Eslani M, Lang PZ, Arya M, Koretz ZA, Bolo KA, Arnett JJ, Roginiel AC, Do JL, Robbins SL, Camp AS, Scott NL, Rudell JC, Weinreb RN, Baxter SL, Granet DB. Analysis of ChatGPT Responses to Ophthalmic Cases: Can ChatGPT Think like an Ophthalmologist? Ophthalmology Science 2025; 5:100600. [PMID: 39346575 PMCID: PMC11437840 DOI: 10.1016/j.xops.2024.100600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Revised: 08/09/2024] [Accepted: 08/13/2024] [Indexed: 10/01/2024]
Abstract
Objective: Large language models such as ChatGPT have demonstrated significant potential in question-answering within ophthalmology, but there is a paucity of literature evaluating their ability to generate clinical assessments and discussions. The objectives of this study were to (1) assess the accuracy of assessment and plans generated by ChatGPT and (2) evaluate ophthalmologists' ability to distinguish between responses generated by clinicians and those generated by ChatGPT. Design: Cross-sectional mixed-methods study. Subjects: Sixteen ophthalmologists from a single academic center, of whom 10 were board-eligible and 6 were board-certified, were recruited to participate in this study. Methods: Prompt engineering was used to ensure that ChatGPT output discussions in the style of the ophthalmologist author of the Medical College of Wisconsin Ophthalmic Case Studies. Cases where ChatGPT accurately identified the primary diagnoses were included and then paired. Masked human-generated and ChatGPT-generated discussions were sent to participating ophthalmologists, who were asked to identify the author of each discussion. Response confidence was assessed using a 5-point Likert scale score, and subjective feedback was manually reviewed. Main Outcome Measures: Accuracy of ophthalmologist identification of discussion author, as well as subjective perceptions of human-generated versus ChatGPT-generated discussions. Results: Overall, ChatGPT correctly identified the primary diagnosis in 15 of 17 (88.2%) cases. Two cases were excluded from the paired comparison due to hallucinations or fabrications of nonuser-provided data. Ophthalmologists correctly identified the author in 77.9% ± 26.6% of the 13 included cases, with a mean Likert scale confidence rating of 3.6 ± 1.0. No significant differences in performance or confidence were found between board-certified and board-eligible ophthalmologists. Subjectively, ophthalmologists found that discussions written by ChatGPT tended to have more generic responses and irrelevant information, hallucinated more frequently, and had distinct syntactic patterns (all P < 0.01). Conclusions: Large language models have the potential to synthesize clinical data and generate ophthalmic discussions. While these findings have exciting implications for artificial intelligence-assisted health care delivery, more rigorous real-world evaluation of these models is necessary before clinical deployment. Financial Disclosures: The author(s) have no proprietary or commercial interest in any materials discussed in this article.
Affiliation(s)
- Jimmy S Chen
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Akshay J Reddy
  - School of Medicine, California University of Science and Medicine, Colton, California
- Eman Al-Sharif
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
  - Surgery Department, College of Medicine, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Marissa K Shoji
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Fritz Gerald P Kalaw
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Medi Eslani
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Paul Z Lang
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Malvika Arya
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Zachary A Koretz
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Kyle A Bolo
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Justin J Arnett
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Aliya C Roginiel
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Jiun L Do
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Shira L Robbins
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Andrew S Camp
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Nathan L Scott
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Jolene C Rudell
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
- Robert N Weinreb
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Sally L Baxter
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- David B Granet
  - Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
2. Harrigian K, Tran D, Tang T, Gonzales A, Nagy P, Kharrazi H, Dredze M, Cai CX. Improving the Identification of Diabetic Retinopathy and Related Conditions in the Electronic Health Record Using Natural Language Processing Methods. Ophthalmology Science 2024; 4:100578. [PMID: 39253550 PMCID: PMC11382176 DOI: 10.1016/j.xops.2024.100578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 07/04/2024] [Accepted: 07/12/2024] [Indexed: 09/11/2024]
Abstract
Purpose: To compare the performance of 3 phenotyping methods in identifying diabetic retinopathy (DR) and related clinical conditions. Design: Three phenotyping methods were used to identify clinical conditions including unspecified DR, nonproliferative DR (NPDR) (mild, moderate, severe), consolidated NPDR (unspecified DR or any NPDR), proliferative DR, diabetic macular edema (DME), vitreous hemorrhage, retinal detachment (RD) (tractional RD or combined tractional and rhegmatogenous RD), and neovascular glaucoma (NVG). The first method used only International Classification of Diseases, 10th Revision (ICD-10) diagnosis codes (ICD-10 Lookup System). The next 2 methods used a natural language processing (NLP) framework based on Bidirectional Encoder Representations from Transformers with a dense multilayer perceptron output layer. The NLP framework was applied either to the free text of provider notes (Text-Only NLP System) or to both free text and ICD-10 diagnosis codes (Text-and-ICD NLP System). Subjects: Adults ≥18 years with diabetes mellitus seen at the Wilmer Eye Institute. Methods: We compared the performance of the 3 phenotyping methods in identifying the DR-related conditions against gold standard chart review. We also compared the estimated disease prevalence using each method. Main Outcome Measures: Performance of each method was reported as the macro F1 score. The agreement between the methods was calculated using the kappa statistic. Prevalence estimates were also calculated for each method. Results: A total of 91,097 patients and 692,486 office visits were included in the study. Compared with the gold standard, the Text-and-ICD NLP System had the highest F1 score for most clinical conditions (range 0.39-0.64). The agreement between the ICD-10 Lookup System and Text-Only NLP System varied (kappa of 0.21-0.81). The prevalence of DR and related conditions ranged from 1.1% for NVG to 17.9% for DME (using the Text-and-ICD NLP System). Conclusions: The prevalence of DR and related conditions varied significantly depending on the methodology used to identify cases. The best performing phenotyping method was the Text-and-ICD NLP System, which used information in both diagnosis codes and free-text notes. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
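The two outcome measures named in this abstract, macro F1 and the kappa statistic, can be illustrated with a minimal sketch. The abstract does not say which kappa variant or which software was used, so Cohen's kappa for two raters is an assumption here, and the function names `macro_f1` and `cohen_kappa` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(a, b):
    """Cohen's kappa (assumed variant): chance-corrected agreement of two label lists."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods always emit the same single label
    return (observed - expected) / (1.0 - expected)


# Toy example: chart review (gold standard) versus one phenotyping method.
gold = [1, 1, 0, 0]
method = [1, 0, 0, 0]
print(macro_f1(gold, method))
print(cohen_kappa(gold, method))
```

Macro averaging gives every condition class equal weight regardless of how many visits carry it, which matters when rare conditions such as NVG sit alongside common ones.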
Affiliation(s)
- Keith Harrigian
  - Department of Computer Science, Johns Hopkins University, Baltimore, Maryland
- Diep Tran
  - Wilmer Eye Institute, Johns Hopkins School of Medicine, Baltimore, Maryland
- Tina Tang
  - Wilmer Eye Institute, Johns Hopkins School of Medicine, Baltimore, Maryland
- Anthony Gonzales
  - Wilmer Eye Institute, Johns Hopkins School of Medicine, Baltimore, Maryland
- Paul Nagy
  - Department of Biomedical Informatics and Data Science, Johns Hopkins School of Medicine, Johns Hopkins University, Baltimore, Maryland
- Hadi Kharrazi
  - Center for Population Health Information Technology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland
- Mark Dredze
  - Department of Computer Science, Johns Hopkins University, Baltimore, Maryland
- Cindy X Cai
  - Wilmer Eye Institute, Johns Hopkins School of Medicine, Baltimore, Maryland
  - Department of Biomedical Informatics and Data Science, Johns Hopkins School of Medicine, Johns Hopkins University, Baltimore, Maryland
3. Mora S, Giacobbe DR, Bartalucci C, Viglietti G, Mikulska M, Vena A, Ball L, Robba C, Cappello A, Battaglini D, Brunetti I, Pelosi P, Bassetti M, Giacomini M. Towards the automatic calculation of the EQUAL Candida Score: Extraction of CVC-related information from EMRs of critically ill patients with candidemia in Intensive Care Units. J Biomed Inform 2024; 156:104667. [PMID: 38848885 DOI: 10.1016/j.jbi.2024.104667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 06/01/2024] [Accepted: 06/03/2024] [Indexed: 06/09/2024]
Abstract
OBJECTIVES: Candidemia is the most frequent invasive fungal disease and the fourth most frequent bloodstream infection in hospitalized patients. Its optimal management is crucial for improving patients' survival. The quality of candidemia management can be assessed with the EQUAL Candida Score. The objective of this work is to support its automatic calculation by extracting central venous catheter (CVC)-related information from Italian text in clinical notes of electronic medical records. MATERIALS AND METHODS: The sample includes 4,787 clinical notes of 108 patients hospitalized between January 2018 and December 2020 in the Intensive Care Units of the IRCCS San Martino Polyclinic Hospital in Genoa (Italy). The devised pipeline exploits natural language processing (NLP) to produce numerical representations of clinical notes, which are used as input to machine learning (ML) algorithms to identify CVC presence and removal. It compares the performance of (i) a rule-based method, (ii) a count-based method combined with an ML algorithm, and (iii) a transformer-based model. RESULTS: Results, obtained with three different approaches, were evaluated in terms of weighted F1 score. The random forest classifier showed the highest performance in both tasks, reaching 82.35%. CONCLUSION: The present work constitutes a first step towards the automatic calculation of the EQUAL Candida Score from unstructured daily collected data by combining ML and NLP methods. The automatic calculation of the EQUAL Candida Score could provide crucial real-time feedback on the quality of candidemia management, aimed at further improving patients' health.
Affiliation(s)
- Sara Mora
  - Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy; UO Information and Communication Technologies (ICT), IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Daniele Roberto Giacobbe
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Claudia Bartalucci
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Giulia Viglietti
  - Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Malgorzata Mikulska
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Antonio Vena
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Lorenzo Ball
  - Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Chiara Robba
  - Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Alice Cappello
  - Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Denise Battaglini
  - Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Iole Brunetti
  - Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Paolo Pelosi
  - Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Matteo Bassetti
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Mauro Giacomini
  - Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy
4. Stein JD, Zhou Y, Andrews CA, Kim JE, Addis V, Bixler J, Grove N, McMillan B, Munir SZ, Pershing S, Schultz JS, Stagg BC, Wang SY, Woreta F. Using Natural Language Processing to Identify Different Lens Pathology in Electronic Health Records. Am J Ophthalmol 2024; 262:153-160. [PMID: 38296152 PMCID: PMC11098689 DOI: 10.1016/j.ajo.2024.01.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 01/21/2024] [Accepted: 01/22/2024] [Indexed: 05/18/2024]
Abstract
PURPOSE: Nearly all published ophthalmology-related Big Data studies rely exclusively on International Classification of Diseases (ICD) billing codes to identify patients with particular ocular conditions. However, inaccurate or nonspecific codes may be used. We assessed whether natural language processing (NLP), as an alternative approach, could more accurately identify lens pathology. DESIGN: Database study comparing the accuracy of NLP versus ICD billing codes in identifying lens pathology. METHODS: We developed an NLP algorithm capable of searching free-text lens exam data in the electronic health record (EHR) to identify the type(s) of cataract present, cataract density, presence of intraocular lenses, and other lens pathology. We applied our algorithm to 17.5 million lens exam records in the Sight Outcomes Research Collaborative (SOURCE) repository. We selected 4314 unique lens-exam entries and asked 11 clinicians to assess whether all pathology present in the entries had been correctly identified in the NLP algorithm output. The algorithm's sensitivity at identifying lens pathology was compared with that of the ICD codes. RESULTS: The NLP algorithm correctly identified all lens pathology present in 4104 of the 4314 lens-exam entries (95.1%). For less common lens pathology, algorithm findings were corroborated by reviewing clinicians for 100% of mentions of pseudoexfoliation material and 99.7% for phimosis, subluxation, and synechia. Sensitivity at identifying lens pathology was better for NLP (0.98 [0.96-0.99]) than for billing codes (0.49 [0.46-0.53]). CONCLUSIONS: Our NLP algorithm identifies and classifies lens abnormalities routinely documented by eye-care professionals with high accuracy. Such algorithms will help researchers to properly identify and classify ocular pathology, broadening the scope of feasible research using real-world data.
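As a worked illustration of the figures reported above, the sketch below computes a simple proportion, a sensitivity, and an interval around a proportion. The abstract does not state how its bracketed intervals were derived, so the Wilson score interval is an assumption, and the helper names `sensitivity` and `wilson_interval` are hypothetical.

```python
import math


def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall) = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Proportion of reviewed entries with all pathology correctly identified,
# using the counts reported in the abstract (4104 of 4314).
print(round(100 * 4104 / 4314, 1))  # 95.1

# Interval around that proportion under the Wilson assumption.
low, high = wilson_interval(4104, 4314)
print(low, high)
```

Note that the 95.1% figure is an entry-level agreement rate, while the 0.98 and 0.49 values are sensitivities, so the two are not directly interchangeable.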
Affiliation(s)
- Joshua D Stein
  - From the W.K. Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, Michigan, USA (J.D.S., Y.Z., C.A.A., J.B.); Department of Health Management and Policy, University of Michigan School of Public Health, Ann Arbor, Michigan, USA (J.D.S.)
- Yunshu Zhou
  - From the W.K. Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, Michigan, USA (J.D.S., Y.Z., C.A.A., J.B.)
- Chris A Andrews
  - From the W.K. Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, Michigan, USA (J.D.S., Y.Z., C.A.A., J.B.)
- Judy E Kim
  - Department of Ophthalmology and Visual Sciences, Medical College of Wisconsin, Milwaukee, Wisconsin, USA (J.E.K.)
- Victoria Addis
  - Department of Ophthalmology, University of Pennsylvania, Philadelphia, Pennsylvania, USA (V.A.)
- Jill Bixler
  - From the W.K. Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, Michigan, USA (J.D.S., Y.Z., C.A.A., J.B.)
- Nathan Grove
  - Department of Ophthalmology, University of Colorado School of Medicine, Aurora, Colorado, USA (N.G.)
- Brian McMillan
  - Department of Ophthalmology and Visual Sciences, West Virginia University, Morgantown, West Virginia, USA (B.M.)
- Saleha Z Munir
  - Department of Ophthalmology and Visual Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA (S.Z.M.)
- Suzann Pershing
  - Byers Eye Institute at Stanford, Department of Ophthalmology, Stanford University, Stanford, California, USA (S.P., S.Y.W.); VA Palo Alto Health Care System, Palo Alto, California, USA (S.P.)
- Jeffrey S Schultz
  - Department of Ophthalmology, Montefiore Medical Center, New York, New York, USA (J.S.S.)
- Brian C Stagg
  - Department of Ophthalmology, University of Utah, Salt Lake City, Utah, USA (B.C.S.)
- Sophia Y Wang
  - Byers Eye Institute at Stanford, Department of Ophthalmology, Stanford University, Stanford, California, USA (S.P., S.Y.W.)
- Fasika Woreta
  - Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA (F.W.)
5. Tavakoli K, Kalaw FGP, Bhanvadia S, Hogarth M, Baxter SL. Concept Coverage Analysis of Ophthalmic Infections and Trauma among the Standardized Medical Terminologies SNOMED-CT, ICD-10-CM, and ICD-11. Ophthalmology Science 2023; 3:100337. [PMID: 37449050 PMCID: PMC10336190 DOI: 10.1016/j.xops.2023.100337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 05/10/2023] [Accepted: 05/19/2023] [Indexed: 07/18/2023]
Abstract
Purpose: Widespread electronic health record adoption has generated a large volume of data and emphasized the need for standardized terminology to describe clinical concepts. Here, we undertook a systematic concept coverage analysis to determine the representation of clinical concepts in ophthalmic infection and ophthalmic trauma among standardized medical terminologies, including the Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT), the International Classification of Diseases (ICD) version 10 with clinical modifications (ICD-10-CM), and ICD version 11 (ICD-11). Design: Extraction of concepts related to ophthalmic infection and ophthalmic trauma and structured search in terminology browsers. Data Sources: The American Academy of Ophthalmology Basic and Clinical Science Course (BCSC), SNOMED-CT, and ICD-10-CM terminologies from the Observational Health Data Sciences and Informatics Athena browser, and the ICD-11 terminology browser. Methods: Concepts pertaining to ophthalmic infection and ophthalmic trauma were extracted from the 2022 BCSC free text and index terms. We searched terminology browsers to identify corresponding codes and classified the extent of semantic alignment as equal, wide, narrow, or unmatched in each terminology. The overlap of equal concepts in each terminology was represented in a Venn diagram. Main Outcome Measures: Proportions of clinical concepts with corresponding codes at various levels of semantic alignment. Results: A total of 443 concepts were identified: 304 concepts related to ophthalmic infection and 139 concepts related to ophthalmic trauma. The SNOMED-CT had the highest proportion of equal coverage, with 82.0% (249 of 304) among concepts related to ophthalmic infection and 82.0% (115 of 139) among concepts related to ophthalmic trauma. Across all concepts, 28% (124 of 443) were classified as equal in ICD-10-CM and 52.8% (234 of 443) were classified as equal in ICD-11. Conclusions: The SNOMED-CT had significantly better semantic alignment than ICD-10-CM and ICD-11 for ophthalmic infections and ophthalmic trauma. This demonstrates an opportunity for continued advancement in the representation of ophthalmic concepts in standardized medical terminologies.
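The main outcome measure here, the proportion of concepts at each level of semantic alignment, amounts to a tally over the four categories named in the Methods. The sketch below shows one way to compute it; the function name `coverage_proportions` and the five-item example list are hypothetical and do not come from the study data.

```python
from collections import Counter

# The four alignment levels named in the abstract.
ALIGNMENT_LEVELS = ("equal", "wide", "narrow", "unmatched")


def coverage_proportions(alignments):
    """Return the share of concepts at each alignment level for one terminology."""
    counts = Counter(alignments)
    unknown = set(counts) - set(ALIGNMENT_LEVELS)
    if unknown:
        raise ValueError(f"unexpected alignment labels: {sorted(unknown)}")
    total = len(alignments)
    return {level: counts[level] / total for level in ALIGNMENT_LEVELS}


# Hypothetical classification of five concepts in a single terminology.
example = ["equal", "equal", "wide", "narrow", "unmatched"]
print(coverage_proportions(example))

# The cross-terminology percentages in the Results follow the same arithmetic.
print(round(100 * 124 / 443, 1))  # 28.0 (ICD-10-CM, equal)
print(round(100 * 234 / 443, 1))  # 52.8 (ICD-11, equal)
```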
Affiliation(s)
- Kiana Tavakoli
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
- Fritz Gerald P. Kalaw
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
- Sonali Bhanvadia
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
- Michael Hogarth
  - Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
- Sally L. Baxter
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
6. Chen JS, Lin WC, Yang S, Chiang MF, Hribar MR. Development of an Open-Source Annotated Glaucoma Medication Dataset From Clinical Notes in the Electronic Health Record. Transl Vis Sci Technol 2022; 11:20. [PMID: 36441131 PMCID: PMC9710490 DOI: 10.1167/tvst.11.11.20] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/05/2022] [Accepted: 10/21/2022] [Indexed: 11/30/2022] Open
Abstract
Purpose: To describe the processing methods and characteristics of an open dataset of clinical notes from the electronic health record (EHR) annotated for glaucoma medications. Methods: In this study, 480 clinical notes from office visits, medical record numbers (MRNs), visit identification numbers, provider names, and billing codes were extracted for 480 patients seen for glaucoma by a comprehensive or glaucoma ophthalmologist from January 1, 2019, to August 31, 2020. MRNs and all visit data were de-identified using a hash function with salt from the deidentifyr package. All progress notes were annotated for glaucoma medication name, route, frequency, dosage, and drug use using an open-source annotation tool, Doccano. Annotations were saved separately. All protected health information (PHI) in progress notes and annotated files was de-identified using the published de-identifying algorithm Philter. All progress notes and annotations were manually validated by two ophthalmologists to ensure complete de-identification. Results: The final dataset contained 5520 annotated sentences, including those with and without medications, for 480 clinical notes. Manual validation revealed 10 instances of remaining PHI, which were manually corrected. Conclusions: Annotated free-text clinical notes can be de-identified for upload as an open dataset. As data availability increases with the adoption of EHRs, free-text open datasets will become increasingly valuable for "big data" research and artificial intelligence development. This dataset is published online and publicly available at https://github.com/jche253/Glaucoma_Med_Dataset. Translational Relevance: This open access medication dataset may be a source of raw data for future research in big data and artificial intelligence using free text.
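The idea of de-identifying MRNs with a salted hash can be sketched as follows. This is not the deidentifyr package (which is an R package) and does not claim to reproduce its internals; the choice of SHA-256, the salt length, and the names `make_salt` and `pseudonymize` are all assumptions for illustration.

```python
import hashlib
import secrets


def make_salt(n_bytes=16):
    """Generate a random salt; it must be kept secret and never published."""
    return secrets.token_hex(n_bytes)


def pseudonymize(identifier, salt):
    """Map an identifier to a stable pseudonym via salted SHA-256 (assumed algorithm)."""
    digest = hashlib.sha256((salt + identifier).encode("utf-8"))
    return digest.hexdigest()


salt = make_salt()
# Made-up MRN for illustration only.
pseudonym = pseudonymize("MRN-0001234", salt)
print(pseudonym)
# The same MRN and salt always give the same pseudonym, so visits stay linkable.
print(pseudonymize("MRN-0001234", salt) == pseudonym)
```

The salt matters because MRNs come from a small, guessable space: without it, anyone could hash every plausible MRN and reverse the mapping by lookup.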
Affiliation(s)
- Jimmy S. Chen
  - Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, OR, USA
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, CA, USA
- Wei-Chun Lin
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA
- Sen Yang
  - Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, OR, USA
- Michael F. Chiang
  - National Eye Institute, National Institutes of Health, Bethesda, MD, USA
- Michelle R. Hribar
  - Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, OR, USA
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA
7. Chen JS, Baxter SL. Applications of natural language processing in ophthalmology: present and future. Front Med (Lausanne) 2022; 9:906554. [PMID: 36004369 PMCID: PMC9393550 DOI: 10.3389/fmed.2022.906554] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/28/2022] [Accepted: 05/31/2022] [Indexed: 11/13/2022] Open
Abstract
Advances in technology, including novel ophthalmic imaging devices and adoption of the electronic health record (EHR), have resulted in significantly increased data available for both clinical use and research in ophthalmology. While artificial intelligence (AI) algorithms have the potential to utilize these data to transform clinical care, current applications of AI in ophthalmology have focused mostly on image-based deep learning. Unstructured free text in the EHR represents a tremendous amount of data that is underutilized in big data analyses and predictive AI. Natural language processing (NLP) is a type of AI involved in processing human language that can be used to develop automated algorithms using these vast quantities of available text data. The purpose of this review was to introduce ophthalmologists to NLP by (1) reviewing current applications of NLP in ophthalmology and (2) exploring potential applications of NLP. We reviewed current literature published in PubMed and Google Scholar for articles related to NLP and ophthalmology, and used ancestor search to expand our references. Overall, we found 19 published studies of NLP in ophthalmology. The majority of these publications (16) focused on extracting specific text such as visual acuity from free-text notes for the purposes of quantitative analysis. Other applications included domain embedding, predictive modeling, and topic modeling. Future ophthalmic applications of NLP may also focus on developing search engines for data within free-text notes, cleaning notes, automated question-answering, and translating ophthalmology notes for other specialties or for patients, especially with a growing interest in open notes. As medicine becomes more data-oriented, NLP offers increasing opportunities to augment our ability to harness free-text data and drive innovations in healthcare delivery and treatment of ophthalmic conditions.
Affiliation(s)
- Jimmy S. Chen
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, CA, United States
  - Health Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, United States
- Sally L. Baxter
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, CA, United States
  - Health Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, United States