1
Fonferko-Shadrach B, Strafford H, Jones C, Khan RA, Brown S, Edwards J, Hawken J, Shrimpton LE, White CP, Powell R, Sawhney IMS, Pickrell WO, Lacey AS. Annotation of epilepsy clinic letters for natural language processing. J Biomed Semantics 2024; 15:17. [PMID: 39277770 PMCID: PMC11402197 DOI: 10.1186/s13326-024-00316-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 07/22/2024] [Indexed: 09/17/2024]
Abstract
BACKGROUND Natural language processing (NLP) is increasingly being used to extract structured information from unstructured text to assist clinical decision-making and aid healthcare research. The availability of expert-annotated documents for the development and validation of NLP applications is limited. We created synthetic clinical documents to address this, and to validate the Extraction of Epilepsy Clinical Text version 2 (ExECTv2) NLP pipeline. METHODS We created 200 synthetic clinic letters based on hospital outpatient consultations with epilepsy specialists. The letters were double annotated by trained clinicians and researchers according to agreed guidelines. We used the annotation tool, Markup, with an epilepsy concept list based on the Unified Medical Language System ontology. All annotations were reviewed, and a gold standard set of annotations was agreed and used to validate the performance of ExECTv2. RESULTS The overall inter-annotator agreement (IAA) between the two sets of annotations produced a per item F1 score of 0.73. Validating ExECTv2 using the gold standard gave an overall F1 score of 0.87 per item, and 0.90 per letter. CONCLUSION The synthetic letters, annotations, and annotation guidelines have been made freely available. To our knowledge, this is the first publicly available set of annotated epilepsy clinic letters and guidelines that can be used by NLP researchers with minimal epilepsy knowledge. The IAA results show that clinical text annotation tasks are difficult and require a gold standard to be established by researcher consensus. ExECTv2, our automated epilepsy NLP pipeline, extracted detailed epilepsy information from unstructured epilepsy letters with more accuracy than human annotators, further confirming the utility of NLP for clinical and research applications.
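The per-item F1 score used above for inter-annotator agreement can be illustrated with a minimal sketch. The annotation tuples, labels, and function name below are hypothetical and are not taken from the paper or from the Markup tool; the sketch only assumes that two annotations "agree" when they are identical.

```python
def f1_agreement(annotations_a, annotations_b):
    """Per-item F1 between two annotation sets, treating set A as the reference."""
    a, b = set(annotations_a), set(annotations_b)
    if not a and not b:
        return 1.0  # two empty sets agree trivially
    matched = len(a & b)
    precision = matched / len(b) if b else 0.0
    recall = matched / len(a) if a else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: (letter id, concept label, annotated text)
annotator_1 = {
    ("letter_001", "Diagnosis", "focal epilepsy"),
    ("letter_001", "Prescription", "lamotrigine"),
    ("letter_001", "SeizureFrequency", "two per month"),
}
annotator_2 = {
    ("letter_001", "Diagnosis", "focal epilepsy"),
    ("letter_001", "Prescription", "lamotrigine"),
    ("letter_001", "SeizureFrequency", "two seizures per month"),
}

score = f1_agreement(annotator_1, annotator_2)
```

Because precision and recall simply swap when the two annotators swap roles, the F1 score is the same whichever annotator is treated as the reference, which is one reason it is convenient as an agreement measure.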
Affiliation(s)
- Huw Strafford
  - Swansea University Medical School, Swansea University, Swansea, Wales, UK
- Carys Jones
  - Swansea University Medical School, Swansea University, Swansea, Wales, UK
- Russell A Khan
  - Swansea University Medical School, Swansea University, Swansea, Wales, UK
- Sharon Brown
  - Neurology Department, Swansea Bay University Health Board, Swansea, Wales, UK
- Jenny Edwards
  - Neurology Department, Swansea Bay University Health Board, Swansea, Wales, UK
- Jonathan Hawken
  - Neurology Department, Swansea Bay University Health Board, Swansea, Wales, UK
- Luke E Shrimpton
  - Neurology Department, Swansea Bay University Health Board, Swansea, Wales, UK
- Catharine P White
  - Swansea University Medical School, Swansea University, Swansea, Wales, UK
  - Paediatric Neurology Centre, Swansea Bay University Health Board, Swansea, Wales, UK
- Robert Powell
  - Swansea University Medical School, Swansea University, Swansea, Wales, UK
  - Neurology Department, Swansea Bay University Health Board, Swansea, Wales, UK
- Inder M S Sawhney
  - Swansea University Medical School, Swansea University, Swansea, Wales, UK
  - Neurology Department, Swansea Bay University Health Board, Swansea, Wales, UK
- William O Pickrell
  - Swansea University Medical School, Swansea University, Swansea, Wales, UK
  - Neurology Department, Swansea Bay University Health Board, Swansea, Wales, UK
- Arron S Lacey
  - Swansea University Medical School, Swansea University, Swansea, Wales, UK
2
Fernandes M, Cardall A, Moura LM, McGraw C, Zafar SF, Westover MB. Extracting seizure control metrics from clinic notes of patients with epilepsy: A natural language processing approach. Epilepsy Res 2024; 207:107451. [PMID: 39276641 DOI: 10.1016/j.eplepsyres.2024.107451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 07/17/2024] [Accepted: 09/09/2024] [Indexed: 09/17/2024]
Abstract
OBJECTIVES Monitoring seizure control metrics is key to clinical care of patients with epilepsy. Manually abstracting these metrics from unstructured text in electronic health records (EHR) is laborious. We aimed to abstract the date of last seizure and seizure frequency from clinical notes of patients with epilepsy using natural language processing (NLP). METHODS We extracted seizure control metrics from notes of patients seen in epilepsy clinics from two hospitals in Boston. Extraction was performed with the pretrained model RoBERTa_for_seizureFrequency_QA, for both date of last seizure and seizure frequency, combined with regular expressions. We designed the algorithm to categorize the timing of last seizure ("today", "1-6 days ago", "1-4 weeks ago", "more than 1-3 months ago", "more than 3-6 months ago", "more than 6-12 months ago", "more than 1-2 years ago", "more than 2 years ago") and seizure frequency ("innumerable", "multiple", "daily", "weekly", "monthly", "once per year", "less than once per year"). Our ground truth consisted of structured questionnaires filled out by physicians. Model performance was measured using the areas under the receiver operating characteristic curve (AUROC) and precision recall curve (AUPRC) for categorical labels, and median absolute error (MAE) for ordinal labels, with 95% confidence intervals (CI) estimated via bootstrapping. RESULTS Our cohort included 1773 adult patients with a total of 5658 visits with reported seizure control metrics, seen in epilepsy clinics between December 2018 and May 2022. The cohort average age was 42 years old, the majority were female (57%), White (81%) and non-Hispanic (85%). The models achieved an MAE (95% CI) for date of last seizure of 4 (4.00-4.86) weeks, and for seizure frequency of 0.02 (0.02-0.02) seizures per day. CONCLUSIONS Our NLP approach demonstrates that the extraction of seizure control metrics from EHR is feasible, allowing for large-scale EHR research.
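The categorization of last-seizure timing and the median absolute error described above can be sketched as follows. The category labels are copied from the abstract, but the day cut-offs (`UPPER_BOUNDS_DAYS`) and both function names are assumptions made for illustration; the abstract does not state the exact boundaries, and this is not the authors' algorithm.

```python
from bisect import bisect_left
from statistics import median

# Category labels as listed in the abstract.
CATEGORIES = [
    "today",
    "1-6 days ago",
    "1-4 weeks ago",
    "more than 1-3 months ago",
    "more than 3-6 months ago",
    "more than 6-12 months ago",
    "more than 1-2 years ago",
    "more than 2 years ago",
]

# Assumed inclusive upper bound (in days) for each category except the last.
UPPER_BOUNDS_DAYS = [0, 6, 30, 90, 182, 365, 730]


def categorize_last_seizure(days_since_last_seizure):
    """Map a non-negative day count to one of the timing categories."""
    if days_since_last_seizure < 0:
        raise ValueError("days_since_last_seizure must be non-negative")
    return CATEGORIES[bisect_left(UPPER_BOUNDS_DAYS, days_since_last_seizure)]


def median_absolute_error(predicted, reference):
    """Median of the absolute differences between paired values."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    return median(abs(p - r) for p, r in zip(predicted, reference))


# Hypothetical predicted vs. physician-reported days since last seizure.
predicted_days = [0, 10, 40, 400]
reference_days = [0, 14, 35, 800]
mae_days = median_absolute_error(predicted_days, reference_days)
```

The median is less sensitive than the mean to a few large misses (such as the 400-day error in the toy data), which may be why a median-based error is attractive for skewed time intervals.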
Affiliation(s)
- Marta Fernandes
  - Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States
- Aidan Cardall
  - Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States
- Lidia Mvr Moura
  - Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States
- Christopher McGraw
  - Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States
- Sahar F Zafar
  - Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States
- M Brandon Westover
  - Harvard Medical School, Boston, MA, United States; Beth Israel Deaconess Medical Center (BIDMC), Boston, MA, United States
3
Galer PD, Parthasarathy S, Xian J, McKee JL, Ruggiero SM, Ganesan S, Kaufman MC, Cohen SR, Haag S, Chen C, Ojemann WKS, Kim D, Wilmarth O, Vaidiswaran P, Sederman C, Ellis CA, Gonzalez AK, Boßelmann CM, Lal D, Sederman R, Lewis-Smith D, Litt B, Helbig I. Clinical signatures of genetic epilepsies precede diagnosis in electronic medical records of 32,000 individuals. Genet Med 2024; 26:101211. [PMID: 39011766 DOI: 10.1016/j.gim.2024.101211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 07/10/2024] [Accepted: 07/10/2024] [Indexed: 07/17/2024]
Abstract
PURPOSE An early genetic diagnosis can guide the time-sensitive treatment of individuals with genetic epilepsies. However, most genetic diagnoses occur long after disease onset. We aimed to identify early clinical features suggestive of genetic diagnoses in individuals with epilepsy through large-scale analysis of full-text electronic medical records. METHODS We extracted 89 million time-stamped standardized clinical annotations using Natural Language Processing from 4,572,783 clinical notes from 32,112 individuals with childhood epilepsy, including 1925 individuals with known or presumed genetic epilepsies. We applied these features to train random forest models to predict SCN1A-related disorders and any genetic diagnosis. RESULTS We identified 47,774 age-dependent associations of clinical features with genetic etiologies a median of 3.6 years before molecular diagnosis. Across all 710 genetic etiologies identified in our cohort, neurodevelopmental differences between 6 and 9 months increased the likelihood of a later molecular diagnosis 5-fold (P < .0001, 95% CI = 3.55-7.42). A later diagnosis of SCN1A-related disorders (area under the curve [AUC] = 0.91) or an overall positive genetic diagnosis (AUC = 0.82) could be reliably predicted using random forest models. CONCLUSION Clinical features predictive of genetic epilepsies precede molecular diagnoses by up to several years in conditions with known precision treatments. An earlier diagnosis facilitated by automated electronic medical records analysis has the potential for earlier targeted therapeutic strategies in the genetic epilepsies.
Affiliation(s)
- Peter D Galer
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA; University of Pennsylvania, Center for Neuroengineering and Therapeutics, Philadelphia, PA
- Shridhar Parthasarathy
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA
- Julie Xian
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA
- Jillian L McKee
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA; Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
- Sarah M Ruggiero
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA
- Shiva Ganesan
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA
- Michael C Kaufman
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA
- Stacey R Cohen
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA
- Scott Haag
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA
- William K S Ojemann
  - University of Pennsylvania, Center for Neuroengineering and Therapeutics, Philadelphia, PA
- Olivia Wilmarth
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA
- Priya Vaidiswaran
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA
- Casey Sederman
  - Department of Human Genetics, University of Utah, Salt Lake City, UT; Utah Center for Genetic Discovery, School of Medicine, University of Utah, Salt Lake City, UT
- Colin A Ellis
  - The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA; Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
- Alexander K Gonzalez
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA
- Christian M Boßelmann
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH
- Dennis Lal
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH; Cologne Center for Genomics (CCG), University of Cologne, Cologne, Germany
- David Lewis-Smith
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; Translational and Clinical Research Institute, Newcastle University, Newcastle-upon-Tyne, UK; Newcastle Upon Tyne Hospitals NHS Foundation Trust, Newcastle-upon-Tyne, UK; FutureNeuro SFI Research Centre, RCSI University of Medicine and Health Sciences, Dublin 2, Ireland
- Brian Litt
  - University of Pennsylvania, Center for Neuroengineering and Therapeutics, Philadelphia, PA; Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
- Ingo Helbig
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA; Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
4
van Diessen E, van Amerongen RA, Zijlmans M, Otte WM. Potential merits and flaws of large language models in epilepsy care: A critical review. Epilepsia 2024; 65:873-886. [PMID: 38305763 DOI: 10.1111/epi.17907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 12/30/2023] [Accepted: 01/19/2024] [Indexed: 02/03/2024]
Abstract
The current pace of development and applications of large language models (LLMs) is unprecedented and will impact future medical care significantly. In this critical review, we provide the background to better understand these novel artificial intelligence (AI) models and how LLMs can be of future use in the daily care of people with epilepsy. Considering the importance of clinical history taking in diagnosing and monitoring epilepsy, combined with the established use of electronic health records, a great potential exists to integrate LLMs in epilepsy care. We present the currently available LLM studies in epilepsy. Furthermore, we highlight and compare the most commonly used LLMs and elaborate on how these models can be applied in epilepsy. We further discuss important drawbacks and risks of LLMs, and we provide recommendations for overcoming these limitations.
Affiliation(s)
- Eric van Diessen
  - Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Department of Pediatrics, Franciscus Gasthuis & Vlietland, Rotterdam, The Netherlands
- Ramon A van Amerongen
  - Faculty of Science, Bioinformatics and Biocomplexity, Utrecht University, Utrecht, The Netherlands
- Maeike Zijlmans
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Stichting Epilepsie Instellingen Nederland, Heemstede, The Netherlands
- Willem M Otte
  - Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
5
Tsai AY, Carter SR, Greene AC. Artificial intelligence in pediatric surgery. Semin Pediatr Surg 2024; 33:151390. [PMID: 38242061 DOI: 10.1016/j.sempedsurg.2024.151390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2024]
Abstract
Artificial intelligence (AI) is rapidly changing the landscape of medicine and is already being utilized in conjunction with medical diagnostics and imaging analysis. We hereby explore AI applications in surgery and examine its relevance to pediatric surgery, covering its evolution, current state, and promising future. The various fields of AI are explored, including machine learning and applications to predictive analytics and decision support in surgery; computer vision and image analysis in preoperative planning, image segmentation, and surgical navigation; and finally, natural language processing to assist in expediting clinical documentation, identification of clinical indications, quality improvement, outcome research, and other types of automated data extraction. The purpose of this review is to familiarize the pediatric surgical community with the rise of AI and highlight the ongoing advancements and challenges in its adoption, including data privacy, regulatory considerations, and the imperative for interdisciplinary collaboration. We hope this review serves as a comprehensive guide to AI's transformative influence on surgery, demonstrating its potential to enhance pediatric surgical patient outcomes, improve precision, and usher in a new era of surgical excellence.
Affiliation(s)
- Anthony Y Tsai
  - Division of Pediatric Surgery, Penn State Health Children's Hospital, 500 University Drive, Hershey, PA 17033, United States
- Stewart R Carter
  - Division of Pediatric Surgery, University of Louisville School of Medicine, Louisville, KY, United States
- Alicia C Greene
  - Division of Pediatric Surgery, Penn State Health Children's Hospital, 500 University Drive, Hershey, PA 17033, United States
6
Mora S, Turrisi R, Chiarella L, Consales A, Tassi L, Mai R, Nobili L, Barla A, Arnulfo G. NLP-based tools for localization of the epileptogenic zone in patients with drug-resistant focal epilepsy. Sci Rep 2024; 14:2349. [PMID: 38287042 PMCID: PMC10825198 DOI: 10.1038/s41598-024-51846-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 01/10/2024] [Indexed: 01/31/2024]
Abstract
Epilepsy surgery is an option for people with focal onset drug-resistant (DR) seizures, but a delayed or incorrect diagnosis of epileptogenic zone (EZ) location limits its efficacy. Seizure semiological manifestations and their chronological appearance contain valuable information on the putative EZ location, but their interpretation relies on extensive experience. The aim of our work is to support the localization of the EZ in DR patients by automatically analyzing the semiological description of seizures contained in video-EEG reports. Our sample is composed of 536 descriptions of seizures extracted from Electronic Medical Records of 122 patients. We devised numerical representations of anamnestic records and seizure descriptions, exploiting Natural Language Processing (NLP) techniques, and used them to feed Machine Learning (ML) models. We performed three binary classification tasks: localizing the EZ in the right or left hemisphere, temporal or extra-temporal, and frontal or posterior regions. Our computational pipeline reached performances above 70% in all tasks. These results show that NLP-based numerical representation combined with ML-based classification models may help in localizing the origin of the seizures relying on seizure-related semiological text data alone. Accurate early recognition of the EZ could enable more appropriate patient management and faster access to epilepsy surgery for potential candidates.
Affiliation(s)
- Sara Mora
  - Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, 16145, Genoa, Italy
- Rosanna Turrisi
  - Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, 16145, Genoa, Italy
  - MaLGa Machine Learning Genoa Center, University of Genoa, 16146, Genoa, Italy
- Lorenzo Chiarella
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Child and Maternal Health (DINOGMI), University of Genoa, 16132, Genoa, Italy
  - Child Neuropsychiatry Unit, IRCCS Istituto Giannina Gaslini, Member of the European Reference Network EpiCARE, 16147, Genoa, Italy
- Alessandro Consales
  - Division of Neurosurgery, IRCCS Istituto Giannina Gaslini, 16147, Genoa, Italy
- Laura Tassi
  - "Claudio Munari" Epilepsy Surgery Center, Niguarda Hospital, 20162, Milan, Italy
- Roberto Mai
  - "Claudio Munari" Epilepsy Surgery Center, Niguarda Hospital, 20162, Milan, Italy
- Lino Nobili
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Child and Maternal Health (DINOGMI), University of Genoa, 16132, Genoa, Italy
  - Child Neuropsychiatry Unit, IRCCS Istituto Giannina Gaslini, Member of the European Reference Network EpiCARE, 16147, Genoa, Italy
- Annalisa Barla
  - Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, 16145, Genoa, Italy
  - MaLGa Machine Learning Genoa Center, University of Genoa, 16146, Genoa, Italy
- Gabriele Arnulfo
  - Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, 16145, Genoa, Italy
  - Neuroscience Center, Helsinki Institute of Life Science (HiLife), University of Helsinki, 00014, Helsinki, Finland
7
Msosa YJ, Grauslys A, Zhou Y, Wang T, Buchan I, Langan P, Foster S, Walker M, Pearson M, Folarin A, Roberts A, Maskell S, Dobson R, Kullu C, Kehoe D. Trustworthy Data and AI Environments for Clinical Prediction: Application to Crisis-Risk in People With Depression. IEEE J Biomed Health Inform 2023; 27:5588-5598. [PMID: 37669205 DOI: 10.1109/jbhi.2023.3312011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023]
Abstract
Depression is a common mental health condition that often occurs in association with other chronic illnesses, and varies considerably in severity. Electronic Health Records (EHRs) contain rich information about a patient's medical history and can be used to train, test and maintain predictive models to support and improve patient care. This work evaluated the feasibility of implementing an environment for predicting mental health crisis among people living with depression based on both structured and unstructured EHRs. A large EHR from a mental health provider, Mersey Care, was pseudonymised and ingested into the Natural Language Processing (NLP) platform CogStack, allowing text content in binary clinical notes to be extracted. All unstructured clinical notes and summaries were semantically annotated by MedCAT and BioYODIE NLP services. Cases of crisis in patients with depression were then identified. Random forest models, gradient boosting trees, and Long Short-Term Memory (LSTM) networks, with varying feature arrangement, were trained to predict the occurrence of crisis. The results showed that all the prediction models can use a combination of structured and unstructured EHR information to predict crisis in patients with depression with good and useful accuracy. The LSTM network that was trained on a modified dataset with only 1000 most-important features from the random forest model with temporality showed the best performance with a mean AUC of 0.901 and a standard deviation of 0.006 using a training dataset and a mean AUC of 0.810 and 0.01 using a hold-out test dataset. Comparing the results from the technical evaluation with the views of psychiatrists shows that there are now opportunities to refine and integrate such prediction models into pragmatic point-of-care clinical decision support tools for supporting mental healthcare delivery.
8
Vulpius SA, Werge S, Jørgensen IF, Siggaard T, Hernansanz Biel J, Knudsen GM, Brunak S, Pinborg LH. Text mining of electronic health records can validate a register-based diagnosis of epilepsy and subgroup into focal and generalized epilepsy. Epilepsia 2023; 64:2750-2760. [PMID: 37548470 DOI: 10.1111/epi.17734] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/24/2023] [Revised: 08/01/2023] [Accepted: 08/01/2023] [Indexed: 08/08/2023]
Abstract
OBJECTIVE Combining population-based health registries and electronic health records offers the opportunity to create large, phenotypically detailed patient cohorts of high quality. In this study, we used text mining of clinical notes to confirm International Classification of Diseases, 10th Revision (ICD-10)-registered epilepsy diagnoses and classify patients according to focal and generalized epilepsy types. METHODS Using the Danish National Patient Registry, we identified patients who between 2006 and 2016 received an ICD-10 diagnosis of epilepsy. To validate the epilepsy diagnosis and stratify patients into focal and generalized epilepsy types, we constructed dictionaries for text mining-based extraction of clinical notes. Two physicians manually reviewed the clinical notes for a total of 527 patients and assigned epilepsy diagnoses, which were compared with the text-mined diagnoses. RESULTS We identified 23 632 patients with an ICD-10 diagnosis of epilepsy, of whom 50% were registered with an unspecified epilepsy diagnosis. In total, 11 211 patients were considered likely to have epilepsy by text mining, with an F1 measure ranging from 82% to 90%. Manual review of the electronic health records for 310 patients revealed a false discovery rate of 29%. This rate was decreased to 4% by the text mining algorithm. The weighted average F1 measure for text mining-assigned epilepsy types was 79% (82% for focal and 76% for generalized epilepsy). Text mining successfully assigned a focal or generalized epilepsy type to 92% of the text mining-eligible patients registered with unspecified epilepsy. SIGNIFICANCE Text mining of electronic health records can be used to establish a patient cohort with much higher likelihood of having a diagnosis of epilepsy and a focal or generalized epilepsy type compared to the cohort created from ICD-10 epilepsy codes alone. We believe the concept will be essential for future genome-wide and phenome-wide association studies and subsequently the development of precision medicine for epilepsy patients.
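Dictionary-based text mining of the kind described above can be sketched in a few lines. The term lists, the English example notes, and the function name are all hypothetical; the study built its own dictionaries for Danish clinical notes, which are not reproduced here, and this sketch ignores negation and context.

```python
import re

# Hypothetical English term lists, for illustration only.
DICTIONARIES = {
    "focal": ["focal", "temporal lobe", "partial seizure"],
    "generalized": ["generalized", "generalised", "absence", "myoclonic"],
}


def assign_epilepsy_type(note):
    """Assign the type whose dictionary terms occur most often in the note.

    Returns "unspecified" when no term matches or when the counts tie.
    """
    text = note.lower()
    counts = {
        label: sum(
            # A leading word boundary only, so plural forms also match.
            len(re.findall(r"\b" + re.escape(term), text))
            for term in terms
        )
        for label, terms in DICTIONARIES.items()
    }
    best = max(counts.values())
    winners = [label for label, count in counts.items() if count == best]
    if best == 0 or len(winners) > 1:
        return "unspecified"
    return winners[0]


notes = [
    "Focal seizures arising from the left temporal lobe.",
    "Juvenile myoclonic epilepsy with generalised tonic-clonic seizures.",
    "Routine review, no new concerns.",
]
assigned = [assign_epilepsy_type(note) for note in notes]
```

A real pipeline would need to handle negated mentions ("no focal features") and family history, which is part of why the authors compared the text-mined labels against manual physician review.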
Affiliation(s)
- Siri A Vulpius
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Sebastian Werge
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Isabella Friis Jørgensen
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Troels Siggaard
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Jorge Hernansanz Biel
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Gitte M Knudsen
  - Epilepsy Clinic and Neurobiology Research Unit, University Hospital Rigshospitalet, Copenhagen, Denmark
  - Institute for Clinical Medicine, Faculty of Health and Medicine, University of Copenhagen, Copenhagen, Denmark
- Søren Brunak
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Lars H Pinborg
  - Epilepsy Clinic and Neurobiology Research Unit, University Hospital Rigshospitalet, Copenhagen, Denmark
  - Institute for Clinical Medicine, Faculty of Health and Medicine, University of Copenhagen, Copenhagen, Denmark
9
Bosch D, Kuppen MCP, Tascilar M, Smilde TJ, Mulders PFA, Uyl-de Groot CA, van Oort IM. Reliability and Efficiency of the CAPRI-3 Metastatic Prostate Cancer Registry Driven by Artificial Intelligence. Cancers (Basel) 2023; 15:3808. [PMID: 37568624 PMCID: PMC10417512 DOI: 10.3390/cancers15153808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 07/19/2023] [Accepted: 07/23/2023] [Indexed: 08/13/2023]
Abstract
BACKGROUND Manual data collection is still the gold standard for disease-specific patient registries. However, CAPRI-3 uses text mining (an artificial intelligence (AI) technology) for patient identification and data collection. The aim of this study is to demonstrate the reliability and efficiency of this AI-driven approach. METHODS CAPRI-3 is an observational retrospective multicenter cohort registry on metastatic prostate cancer. We tested the patient-identification algorithm and automated data extraction through manual validation of the same patients in two pilots in 2019 and 2022. RESULTS Pilot one identified 2030 patients and pilot two 9464 patients. The negative predictive value of the algorithm was maximized to prevent false exclusions and reached 94.8%. The completeness and accuracy of the automated data extraction were 92.3% or higher, except for date fields and inaccessible data (images/pdf) (10-88.9%). Additional manual quality control took over 3 h less time per patient than the original fully manual CAPRI registry (105 vs. 300 min). CONCLUSIONS The CAPRI-3 patient-identification algorithm is a sound replacement for excluding ineligible candidates. The AI-driven data extraction is largely accurate and complete, but manual quality control is needed for less reliable and inaccessible data. Overall, the AI-driven approach of the CAPRI-3 registry is reliable and timesaving.
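For readers less familiar with the metrics above, the following sketch shows how a negative predictive value is computed and checks the reported time saving. The confusion-matrix counts are hypothetical and chosen only for illustration; the 105 and 300 minute figures are the ones quoted in the abstract.

```python
def negative_predictive_value(true_negatives, false_negatives):
    """Share of algorithm-excluded patients who were correctly excluded."""
    total_negative_calls = true_negatives + false_negatives
    if total_negative_calls == 0:
        raise ValueError("no negative predictions to evaluate")
    return true_negatives / total_negative_calls


# Hypothetical counts, not taken from the CAPRI-3 pilots.
npv = negative_predictive_value(true_negatives=90, false_negatives=10)

# Time per patient as quoted in the abstract (minutes).
AI_ASSISTED_MINUTES = 105
FULLY_MANUAL_MINUTES = 300
hours_saved_per_patient = (FULLY_MANUAL_MINUTES - AI_ASSISTED_MINUTES) / 60
```

A high negative predictive value matters for a registry because every false exclusion is an eligible patient who is silently lost, whereas a false inclusion can still be caught during manual quality control.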
Affiliation(s)
- Dianne Bosch
- Department of Urology, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands (I.M.v.O.)
| | - Malou C. P. Kuppen
- Department of Radiotherapy, Maastro Clinic, 6229 ET Maastricht, The Netherlands
| | - Metin Tascilar
- Department of Medical Oncology, Isala Hospital, 8025 AB Zwolle, The Netherlands
| | - Tineke J. Smilde
- Department of Medical Oncology, Jeroen Bosch Hospital, 5223 GZ ‘s-Hertogenbosch, The Netherlands;
| | - Peter F. A. Mulders
- Department of Urology, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
| | - Carin A. Uyl-de Groot
- Erasmus School of Health Policy and Management, Erasmus University Rotterdam, 3062 PA Rotterdam, The Netherlands
| | - Inge M. van Oort
- Department of Urology, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
|
10
|
Xie K, Gallagher RS, Shinohara RT, Xie SX, Hill CE, Conrad EC, Davis KA, Roth D, Litt B, Ellis CA. Long-term epilepsy outcome dynamics revealed by natural language processing of clinic notes. Epilepsia 2023; 64:1900-1909. [PMID: 37114472 PMCID: PMC10523917 DOI: 10.1111/epi.17633] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/16/2022] [Revised: 04/26/2023] [Accepted: 04/26/2023] [Indexed: 04/29/2023]
Abstract
OBJECTIVE Electronic medical records allow for retrospective clinical research with large patient cohorts. However, epilepsy outcomes are often contained in free text notes that are difficult to mine. We recently developed and validated novel natural language processing (NLP) algorithms to automatically extract key epilepsy outcome measures from clinic notes. In this study, we assessed the feasibility of extracting these measures to study the natural history of epilepsy at our center. METHODS We applied our previously validated NLP algorithms to extract seizure freedom, seizure frequency, and date of most recent seizure from outpatient visits at our epilepsy center from 2010 to 2022. We examined the dynamics of seizure outcomes over time using Markov model-based probability and Kaplan-Meier analyses. RESULTS Performance of our algorithms on classifying seizure freedom was comparable to that of human reviewers (algorithm F1 = .88 vs. human annotator κ = .86). We extracted seizure outcome data from 55 630 clinic notes from 9510 unique patients written by 53 unique authors. Of these, 30% were classified as seizure-free since the last visit, 48% of non-seizure-free visits contained a quantifiable seizure frequency, and 47% of all visits contained the date of most recent seizure occurrence. Among patients with at least five visits, the probabilities of seizure freedom at the next visit ranged from 12% to 80% in patients having seizures or seizure-free at the prior three visits, respectively. Only 25% of patients who were seizure-free for 6 months remained seizure-free after 10 years. SIGNIFICANCE Our findings demonstrate that epilepsy outcome measures can be extracted accurately from unstructured clinical note text using NLP. At our tertiary center, the disease course often followed a remitting and relapsing pattern. This method represents a powerful new tool for clinical research with many potential uses and extensions to other clinical questions.
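The abstract above reports the probability of seizure freedom at the next visit conditioned on the outcomes of the prior three visits. As a rough illustration only, the sketch below estimates such conditional probabilities by simple counting over per-patient visit sequences. It is not the authors' Markov model; the function name `next_visit_seizure_freedom`, the 1/0 encoding, and the toy histories are all assumptions made for the example.

```python
from collections import defaultdict


def next_visit_seizure_freedom(histories, order=3):
    """Estimate P(seizure-free at next visit | outcomes at the prior `order` visits).

    `histories` is a list of per-patient visit sequences in date order, each a
    list of 1 (seizure-free since last visit) or 0 (not seizure-free).
    This is plain empirical counting, an assumed simplification.
    """
    # state (tuple of prior outcomes) -> [seizure-free next visits, total next visits]
    counts = defaultdict(lambda: [0, 0])
    for visits in histories:
        for i in range(order, len(visits)):
            state = tuple(visits[i - order:i])
            counts[state][0] += visits[i]
            counts[state][1] += 1
    return {state: free / total for state, (free, total) in counts.items()}


# Toy, made-up histories (not data from the study).
toy_histories = [
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 1, 1, 1, 1],
]
probs = next_visit_seizure_freedom(toy_histories)
for state, p in sorted(probs.items()):
    print(state, round(p, 2))
```

Only patients with more than `order` visits contribute to the counts, which loosely mirrors the abstract's restriction to patients with enough follow-up visits.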
Affiliation(s)
- Kevin Xie
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Ryan S. Gallagher
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Russell T. Shinohara
- Penn Statistics in Imaging and Visualization Center, Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Sharon X. Xie
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Chloe E. Hill
- Department of Neurology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Erin C. Conrad
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Kathryn A. Davis
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Dan Roth
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Brian Litt
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Colin A. Ellis
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
|
11
|
Fernandes M, Cardall A, Jing J, Ge W, Moura LMVR, Jacobs C, McGraw C, Zafar SF, Westover MB. Identification of patients with epilepsy using automated electronic health records phenotyping. Epilepsia 2023; 64:1472-1481. [PMID: 36934317 PMCID: PMC10239346 DOI: 10.1111/epi.17589] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/19/2022] [Revised: 03/15/2023] [Accepted: 03/16/2023] [Indexed: 03/20/2023]
Abstract
OBJECTIVE Unstructured data present in electronic health records (EHR) are a rich source of medical information; however, their abstraction is labor intensive. Automated EHR phenotyping (AEP) can reduce the need for manual chart review. We present an AEP model that is designed to automatically identify patients diagnosed with epilepsy. METHODS The ground truth for model training and evaluation was captured from a combination of structured questionnaires filled out by physicians for a subset of patients and manual chart review using customized software. Modeling features included indicators of the presence of keywords and phrases in unstructured clinical notes, prescriptions for antiseizure medications (ASMs), International Classification of Diseases (ICD) codes for seizures and epilepsy, number of ASMs and epilepsy-related ICD codes, age, and sex. Data were randomly divided into training (70%) and hold-out testing (30%) sets, with distinct patients in each set. We trained regularized logistic regression and extreme gradient boosting models. Model performance was measured using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC), with 95% confidence intervals (CI) estimated via bootstrapping. RESULTS Our study cohort included 3903 adults drawn from outpatient departments of nine hospitals between February 2015 and June 2022 (mean age = 47 ± 18 years, 57% women, 82% White, 84% non-Hispanic, 70% with epilepsy). The final models included 285 features, including 246 keywords and phrases captured from 8415 encounters. Both models achieved AUROC and AUPRC of 1 (95% CI = .99-1.00) in the hold-out testing set. SIGNIFICANCE A machine learning-based AEP approach accurately identifies patients with epilepsy from notes, ICD codes, and ASMs. This model can enable large-scale epilepsy research using EHR databases.
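The evaluation design described above (a 70/30 split with distinct patients in each set, and AUROC with bootstrapped 95% confidence intervals) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's code: the helper names `split_by_patient`, `auroc`, and `bootstrap_ci`, the percentile bootstrap, and the toy labels and scores are all hypothetical.

```python
import random


def split_by_patient(patient_ids, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) so that no patient appears in both sets."""
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)
    test_patients = set(unique[:round(len(unique) * test_fraction)])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC; resamples lacking a class are redrawn."""
    if len(set(labels)) < 2:
        raise ValueError("both classes are required")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Toy, made-up labels and scores (not data from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
```

Splitting on patient identifiers rather than on encounters keeps notes from one patient out of both sets, which is the property the abstract emphasizes.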
Affiliation(s)
- Marta Fernandes
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
- Clinical Data Animation Center, Massachusetts General Hospital, Boston, Massachusetts, USA
- Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, Massachusetts, USA
| | - Aidan Cardall
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
- Clinical Data Animation Center, Massachusetts General Hospital, Boston, Massachusetts, USA
| | - Jin Jing
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
- Clinical Data Animation Center, Massachusetts General Hospital, Boston, Massachusetts, USA
- Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, Massachusetts, USA
| | - Wendong Ge
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
- Clinical Data Animation Center, Massachusetts General Hospital, Boston, Massachusetts, USA
- Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, Massachusetts, USA
| | - Lidia M. V. R. Moura
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
| | - Claire Jacobs
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
| | - Christopher McGraw
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
| | - Sahar F. Zafar
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
| | - M. Brandon Westover
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
- Clinical Data Animation Center, Massachusetts General Hospital, Boston, Massachusetts, USA
- Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, Massachusetts, USA
|
12
|
Spalding WM, Bertoia ML, Bulik CM, Seeger JD. Treatment characteristics among patients with binge-eating disorder: an electronic health records analysis. Postgrad Med 2023; 135:254-264. [PMID: 35037815 DOI: 10.1080/00325481.2021.2018255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
OBJECTIVES Treatment for adults diagnosed with binge-eating disorder (BED) includes psychotherapy and/or pharmacotherapy and aims to reduce the frequency of binge-eating episodes and disordered eating, improve metabolic-related issues and reduce weight, and address mood symptoms. Data describing real-world treatment patterns are lacking; therefore, this study aims to characterize real-world treatment patterns among patients with BED. METHODS This retrospective study identified adult patients with BED using natural language processing of clinical notes from the Optum electronic health record database from 2009 to 2015. Treatment patterns were examined during the 12 months preceding the BED recognition date and during a follow-up period after BED recognition (1-3 years for most patients). RESULTS Among 1042 patients, 384 were categorized as the BED cohort and 658, who met less stringent criteria, were categorized as probable BED. In the BED cohort, mean ± SD age was 45.2 ± 13.4 years and 81.8% were women (probable BED, 45.9 ± 12.8 years, 80.2%). A greater percentage of patients in the BED cohort were prescribed pharmacotherapy (70.6% [probable BED, 66.9%]) than received/discussed psychotherapy (53.1% [probable BED, 39.2%]) at baseline. In the BED cohort, 54.4% of patients were prescribed antidepressants (probable BED, 52.4%), 25.3% stimulants (probable BED, 20.1%), and 34.4% nonspecific psychotherapy (probable BED, 24.6%) at baseline, with no substantive differences observed during follow-up. Low percentages of patients in the BED cohort received/discussed cognitive behavioral therapy at baseline (12.5% [probable BED, 9.0%]) or during follow-up (13.0% [probable BED, 8.8%]). Among patients with ≥1 psychotherapy visit, the mean ± SD number of visits in the BED cohort was 1.2 ± 5.9 at baseline (probable BED, 1.7 ± 7.3) and 2.2 ± 7.7 during follow-up (probable BED, 2.6 ± 7.7).
CONCLUSION This cohort of patients with BED was treated more frequently with pharmacotherapy than psychotherapy. These data may help inform strategies for reducing differences between real-world treatment patterns and evidence-based recommendations.
Affiliation(s)
| | | | - Cynthia M Bulik
- Department of Psychiatry, University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Department of Nutrition, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
|
13
|
Wong S, Simmons A, Rivera-Villicana J, Barnett S, Sivathamboo S, Perucca P, Ge Z, Kwan P, Kuhlmann L, Vasa R, Mouzakis K, O'Brien TJ. EEG datasets for seizure detection and prediction: A review. Epilepsia Open 2023. [PMID: 36740244 DOI: 10.1002/epi4.12704] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/29/2022] [Accepted: 01/28/2023] [Indexed: 02/07/2023] Open
Abstract
Electroencephalogram (EEG) datasets from epilepsy patients have been used to develop seizure detection and prediction algorithms using machine learning (ML) techniques with the aim of implementing the learned model in a device. However, the format and structure of publicly available datasets are different from each other, and there is a lack of guidelines on the use of these datasets. This impacts the generatability, generalizability, and reproducibility of the results and findings produced by the studies. In this narrative review, we compiled and compared the different characteristics of the publicly available EEG datasets that are commonly used to develop seizure detection and prediction algorithms. We investigated the advantages and limitations of the characteristics of the EEG datasets. Based on our study, we identified 17 characteristics that make the EEG datasets unique from each other. We also briefly looked into how certain characteristics of the publicly available datasets affect the performance and outcome of a study, as well as the influences it has on the choice of ML techniques and preprocessing steps required to develop seizure detection and prediction algorithms. In conclusion, this study provides a guideline on the choice of publicly available EEG datasets to both clinicians and scientists working to develop a reproducible, generalizable, and effective seizure detection and prediction algorithm.
Affiliation(s)
- Sheng Wong
- Applied Artificial Intelligence Institute, Deakin University, Burwood, Victoria, Australia
| | - Anj Simmons
- Applied Artificial Intelligence Institute, Deakin University, Burwood, Victoria, Australia
| | | | - Scott Barnett
- Applied Artificial Intelligence Institute, Deakin University, Burwood, Victoria, Australia
| | - Shobi Sivathamboo
- Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, Victoria, Australia; Department of Neurology, The Royal Melbourne Hospital, Parkville, Victoria, Australia; Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, Victoria, Australia
| | - Piero Perucca
- Department of Neurology, The Royal Melbourne Hospital, Parkville, Victoria, Australia; Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, Victoria, Australia; Department of Medicine, Austin Health, The University of Melbourne, Heidelberg, Victoria, Australia; Comprehensive Epilepsy Program, Austin Health, Heidelberg, Victoria, Australia
| | - Zongyuan Ge
- Monash eResearch Centre, Monash University, Clayton, Victoria, Australia
| | - Patrick Kwan
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, Victoria, Australia
| | - Levin Kuhlmann
- Department of Data Science and AI, Faculty of IT, Monash University, Clayton, Victoria, Australia; Department of Medicine, St Vincent's Hospital, The University of Melbourne, Melbourne, Victoria, Australia
| | - Rajesh Vasa
- Applied Artificial Intelligence Institute, Deakin University, Burwood, Victoria, Australia
| | - Kon Mouzakis
- Applied Artificial Intelligence Institute, Deakin University, Burwood, Victoria, Australia
| | - Terence J O'Brien
- Department of Medicine, The Royal Melbourne Hospital, The University of Melbourne, Parkville, Victoria, Australia; Department of Neurology, The Royal Melbourne Hospital, Parkville, Victoria, Australia; Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, Victoria, Australia
|
14
|
Yew ANJ, Schraagen M, Otte WM, van Diessen E. Transforming epilepsy research: A systematic review on natural language processing applications. Epilepsia 2023; 64:292-305. [PMID: 36462150 PMCID: PMC10108221 DOI: 10.1111/epi.17474] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 07/11/2022] [Revised: 11/23/2022] [Accepted: 12/01/2022] [Indexed: 12/05/2022]
Abstract
Despite improved ancillary investigations in epilepsy care, patients' narratives remain indispensable for diagnosis and treatment monitoring. This wealth of information is typically stored in electronic health records and accumulated in medical journals in an unstructured manner, thereby restricting complete utilization in clinical decision-making. To this end, clinical researchers increasingly apply natural language processing (NLP), a branch of artificial intelligence, as it removes ambiguity, derives context, and imbues standardized meaning from free-narrative clinical texts. This systematic review presents an overview of the current NLP applications in epilepsy and discusses the opportunities and drawbacks of NLP alongside its future implications. We searched the PubMed and Embase databases with a "natural language processing" and "epilepsy" query (March 4, 2022) and included original research articles describing the application of NLP techniques for textual analysis in epilepsy. Twenty-six studies were included. Fifty-eight percent of these studies used NLP to classify clinical records into predefined categories, improving patient identification and treatment decisions. Other applications of NLP included structured clinical information retrieval from electronic health records, scientific papers, and online posts of patients. Challenges and opportunities of NLP applications for enhancing epilepsy care and research are discussed. The field could further benefit from NLP by replicating successes in other health care domains, such as NLP-aided quality evaluation for clinical decision-making, outcome prediction, and clinical record summarization.
Affiliation(s)
- Arister N J Yew
- University College Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Marijn Schraagen
- Department of Information and Computing Sciences, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Willem M Otte
- Department of Child Neurology, Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
| | - Eric van Diessen
- Department of Child Neurology, Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
|
15
|
Juang WC, Hsu MH, Cai ZX, Chen CM. Developing an AI-assisted clinical decision support system to enhance in-patient holistic health care. PLoS One 2022; 17:e0276501. [PMID: 36315554 PMCID: PMC9621444 DOI: 10.1371/journal.pone.0276501] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/05/2021] [Accepted: 10/08/2022] [Indexed: 11/06/2022] Open
Abstract
Holistic health care (HHC) is a synonym for complete patient care, and as such an efficient clinical decision support system (CDSS) for HHC is critical for supporting physicians’ decisions in response to patients’ physical, emotional, social, economic, and spiritual needs. The field of artificial intelligence (AI) has evolved considerably in the past decades and many AI applications have been deployed in various contexts. Therefore, this study aims to propose an AI-assisted CDSS model that predicts patients in need of HHC and applies long short-term memory (LSTM), an improved recurrent neural network (RNN) model, for the prediction. The data sources include in-patients’ comorbidity status and daily vital sign attributes such as blood pressure, heart rate, oxygen prescription, etc. A two-year dataset consisting of 121 thousand anonymized patient cases with 890 thousand physiological medical records was obtained from a medical center in Taiwan for system evaluation. Compared with the rule-based expert system, the proposed AI-assisted CDSS improves sensitivity from 26.44% to 80.84% and specificity from 99.23% to 99.95%. The experimental results demonstrate that an AI-assisted CDSS could efficiently predict HHC patients.
Affiliation(s)
- Wang-Chuan Juang
- Quality Management Center, Kaohsiung Veterans General Hospital, Kaohsiung, Taiwan
- Department of Business Management, National Sun Yat-sen University, Kaohsiung, Taiwan
| | - Ming-Hsia Hsu
- Department of Information Management, Kaohsiung Veterans General Hospital, Kaohsiung, Taiwan
- Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan
| | - Zheng-Xun Cai
- Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan
| | - Chia-Mei Chen
- Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan
|
16
|
Decker BM, Turco A, Xu J, Terman SW, Kosaraju N, Jamil A, Davis KA, Litt B, Ellis CA, Khankhanian P, Hill CE. Development of a natural language processing algorithm to extract seizure types and frequencies from the electronic health record. Seizure 2022; 101:48-51. [PMID: 35882104 PMCID: PMC9547963 DOI: 10.1016/j.seizure.2022.07.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/22/2022] [Revised: 07/16/2022] [Accepted: 07/18/2022] [Indexed: 11/21/2022] Open
Abstract
OBJECTIVE To develop a natural language processing (NLP) algorithm to abstract seizure types and frequencies from electronic health records (EHR). BACKGROUND Seizure frequency measurement is an epilepsy quality metric. Yet, abstraction of seizure frequency from the EHR is laborious. We present an NLP algorithm to extract seizure data from unstructured text of clinic notes. Algorithm performance was assessed at two epilepsy centers. METHODS We developed a rules-based NLP algorithm to recognize terms related to seizures and frequency within the text of an outpatient encounter. Algorithm output (e.g. number of seizures of a particular type within a time interval) was compared to seizure data manually annotated by two expert reviewers ("gold standard"). The algorithm was developed from 150 clinic notes from institution #1 (development set), then tested on a separate set of 219 notes from institution #1 (internal test set) with 248 unique seizure frequency elements. The algorithm was separately applied to 100 notes from institution #2 (external test set) with 124 unique seizure frequency elements. Algorithm performance was measured by recall (sensitivity), precision (positive predictive value), and F1 score (geometric mean of precision and recall). RESULTS In the internal test set, the algorithm demonstrated 70% recall (173/248), 95% precision (173/182), and 0.82 F1 score compared to manual review. Algorithm performance in the external test set was lower with 22% recall (27/124), 73% precision (27/37), and 0.40 F1 score. CONCLUSIONS These results suggest NLP extraction of seizure types and frequencies is feasible, though not without challenges in generalizability for large-scale implementation.
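The recall and precision figures above follow directly from the reported counts (for example 173/248 and 173/182 for the internal test set). The abstract describes its F1 score as a geometric mean of precision and recall, whereas F1 is conventionally defined as their harmonic mean, so the sketch below computes both from the same counts for comparison. The function names are illustrative assumptions; only the counts come from the abstract.

```python
import math


def precision_recall(true_positives, n_gold, n_predicted):
    """Precision = TP / predicted elements; recall = TP / gold-standard elements."""
    return true_positives / n_predicted, true_positives / n_gold


def harmonic_f1(precision, recall):
    """Conventional F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def geometric_mean(precision, recall):
    """Geometric mean of precision and recall, as the abstract words it."""
    return math.sqrt(precision * recall)


# Counts as reported in the abstract: (true positives, gold elements, predicted elements).
reported = {"internal": (173, 248, 182), "external": (27, 124, 37)}
results = {}
for name, (tp, n_gold, n_pred) in reported.items():
    p, r = precision_recall(tp, n_gold, n_pred)
    results[name] = {
        "precision": p,
        "recall": r,
        "harmonic_f1": harmonic_f1(p, r),
        "geometric_mean": geometric_mean(p, r),
    }
    print(name, {k: round(v, 2) for k, v in results[name].items()})
```

The two means coincide only when precision equals recall; the further apart they are, as in the external test set, the more the harmonic mean falls below the geometric mean.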
Affiliation(s)
- Barbara M Decker
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States; Department of Neurological Sciences, University of Vermont Medical Center, Burlington, VT, United States.
| | - Alexandra Turco
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
| | - Jian Xu
- Department of Neurology, Henry Ford Health System, Detroit, MI, United States
| | - Samuel W Terman
- Department of Neurology, University of Michigan, Ann Arbor, MI, United States
| | - Nikitha Kosaraju
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
| | - Alisha Jamil
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
| | - Kathryn A Davis
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
| | - Brian Litt
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
| | - Colin A Ellis
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
| | | | - Chloe E Hill
- Department of Neurology, University of Michigan, Ann Arbor, MI, United States
|
17
|
Fernandes M, Donahue MA, Hoch D, Cash S, Zafar S, Jacobs C, Hosford M, Voinescu PE, Fureman B, Buchhalter J, McGraw CM, Westover MB, Moura LMVR. A replicable, open-source, data integration method to support national practice-based research & quality improvement systems. Epilepsy Res 2022; 186:107013. [PMID: 35994859 PMCID: PMC9810436 DOI: 10.1016/j.eplepsyres.2022.107013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2022] [Revised: 04/28/2022] [Accepted: 08/13/2022] [Indexed: 01/07/2023]
Abstract
OBJECTIVES The Epilepsy Learning Healthcare System (ELHS) was created in 2018 to address measurable improvements in outcomes for people with epilepsy. However, fragmentation of data systems has been a major barrier for reporting and participation. In this study, we aimed to test the feasibility of an open-source Data Integration (DI) method that connects real-life clinical data to national research and quality improvement (QI) systems. METHODS The ELHS case report forms were programmed as EPIC SmartPhrases at Mass General Brigham (MGB) in December 2018 and subsequently as EPIC SmartForms in June 2021 to collect actionable, standardized, structured epilepsy data in the electronic health record (EHR) for subsequent pull into the external national registry of the ELHS. Following the QI methodology in the Chronic Care Model, 39 providers, epileptologists and neurologists, incorporated the ELHS SmartPhrase into their clinical workflow, focusing on collecting diagnosis of epilepsy, seizure type according to the International League Against Epilepsy, seizure frequency, date of last seizure, medication adherence and side effects. The collected data was stored in the Enterprise Data Warehouse (EDW) without integration with external systems. We developed and validated a DI method that extracted the data from EDW using structured query language and later preprocessed using text mining. We used the ELHS data dictionary to match fields in the preprocessed notes to obtain the final structured dataset with seizure control information. For illustration, we described the data curated from the care period of 12/2018-12/2021. RESULTS The cohort comprised a total of 1806 patients with a mean age of 43 years old (SD: 17.0), where 57% were female, 80% were white, and 84% were non-Hispanic/Latino. Using our DI method, we automated the data mining, preprocessing, and exporting of the structured dataset into a local database, to be weekly accessible to clinicians and quality improvers. 
During the period of SmartPhrase implementation, there were 5168 clinic visits logged by providers documenting each patient's seizure type and frequency. During this period, providers documented 59% of patients as having focal seizures, 35% as having generalized seizures, and 6% as having another type. Of the cohort, 45% of patients had private insurance. The resulting structured dataset was bulk uploaded via web interface into the external national registry of the ELHS. CONCLUSIONS Structured data can be feasibly extracted from text notes of epilepsy patients for weekly reporting to a national learning healthcare system.
Affiliation(s)
- Marta Fernandes
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States; Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States.
| | - Maria A Donahue
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States; The NeuroValue Lab, MGH, Boston, MA, United States.
| | - Dan Hoch
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
| | - Sydney Cash
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
| | - Sahar Zafar
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
| | - Claire Jacobs
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
| | - Mackenzie Hosford
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
| | - P Emanuela Voinescu
- Harvard Medical School, Boston, MA, United States; Department of Neurology, Division of Epilepsy, Division of Women's Health, Brigham and Women's Hospital, Boston, MA, United States.
| | | | - Jeffrey Buchhalter
- Department of Pediatrics, University of Calgary School of Medicine, Calgary, Canada.
| | - Christopher Michael McGraw
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
| | - M Brandon Westover
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States; Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States; McCance Center for Brain Health, MGH, Boston, MA, United States.
| | - Lidia M V R Moura
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States; The NeuroValue Lab, MGH, Boston, MA, United States.
|
18
|
Crema C, Attardi G, Sartiano D, Redolfi A. Natural language processing in clinical neuroscience and psychiatry: A review. Front Psychiatry 2022; 13:946387. [PMID: 36186874 PMCID: PMC9515453 DOI: 10.3389/fpsyt.2022.946387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/17/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022] Open
Abstract
Natural language processing (NLP) is rapidly becoming an important topic in the medical community. The ability to automatically analyze any type of medical document could be the key factor to fully exploit the data it contains. Cutting-edge artificial intelligence (AI) architectures, particularly machine learning and deep learning, have begun to be applied to this topic and have yielded promising results. We conducted a literature search that identified 1,024 papers that used NLP technology in neuroscience and psychiatry from 2010 to early 2022. After a selection process, 115 papers were evaluated. Each publication was classified into one of three categories: information extraction, classification, and data inference. Automated understanding of clinical reports in electronic health records has the potential to improve healthcare delivery. Overall, the performance of NLP applications is high, with an average F1-score and AUC above 85%. We also derived a composite measure in the form of Z-scores to better compare the performance of NLP models and their different classes as a whole. No statistical differences were found in the unbiased comparison. Strong asymmetry between English and non-English models, difficulty in obtaining high-quality annotated data, and training biases causing low generalizability are the main limitations. This review suggests that NLP could be an effective tool to help clinicians gain insights from medical reports, clinical research forms, and more, and thereby improve the quality of healthcare services.
Affiliation(s)
- Claudio Crema
- Laboratory of Neuroinformatics, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
- Daniele Sartiano
- Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa, Italy
- Alberto Redolfi
- Laboratory of Neuroinformatics, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
19
Xie K, Gallagher RS, Conrad EC, Garrick CO, Baldassano SN, Bernabei JM, Galer PD, Ghosn NJ, Greenblatt AS, Jennings T, Kornspun A, Kulick-Soper CV, Panchal JM, Pattnaik AR, Scheid BH, Wei D, Weitzman M, Muthukrishnan R, Kim J, Litt B, Ellis CA, Roth D. Extracting seizure frequency from epilepsy clinic notes: a machine reading approach to natural language processing. J Am Med Inform Assoc 2022; 29:873-881. [PMID: 35190834 PMCID: PMC9006692 DOI: 10.1093/jamia/ocac018] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 11/26/2021] [Revised: 01/11/2022] [Accepted: 02/08/2022] [Indexed: 11/14/2022]
Abstract
Objective Materials and Methods Results Discussion and Conclusion
Affiliation(s)
- Kevin Xie
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ryan S Gallagher
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Erin C Conrad
- Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Chadric O Garrick
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Steven N Baldassano
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- John M Bernabei
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Peter D Galer
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Nina J Ghosn
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Adam S Greenblatt
- Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Tara Jennings
- Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Alana Kornspun
- Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Catherine V Kulick-Soper
- Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jal M Panchal
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- The General Robotics, Automation, Sensing and Perception Laboratory, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Akash R Pattnaik
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Brittany H Scheid
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Danmeng Wei
- Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Micah Weitzman
- Department of Electrical and Systems Engineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ramya Muthukrishnan
- Department of Computer and Information Science, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Joongwon Kim
- Department of Computer and Information Science, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Brian Litt
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Colin A Ellis
- Corresponding Authors: Colin A. Ellis, MD, Department of Neurology, Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, USA;
20
Buchlak QD, Esmaili N, Bennett C, Farrokhi F. Natural Language Processing Applications in the Clinical Neurosciences: A Machine Learning Augmented Systematic Review. Acta Neurochir Suppl 2022; 134:277-289. [PMID: 34862552 DOI: 10.1007/978-3-030-85292-4_32] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 10/19/2022]
Abstract
Natural language processing (NLP), a domain of artificial intelligence (AI) that models human language, has been used in medicine to automate diagnostics, detect adverse events, support decision making and predict clinical outcomes. However, applications to the clinical neurosciences appear to be limited. NLP has matured with the implementation of deep transformer models (e.g., XLNet, BERT, T5, and RoBERTa) and transfer learning. The objectives of this study were to (1) systematically review NLP applications in the clinical neurosciences, and (2) explore NLP analysis to facilitate literature synthesis, providing clear examples to demonstrate the potential capabilities of these technologies for a clinical audience. Our NLP analysis consisted of keyword identification, text summarization and document classification. A total of 48 articles met inclusion criteria. NLP has been applied in the clinical neurosciences to facilitate literature synthesis, data extraction, patient identification, automated clinical reporting and outcome prediction. The number of publications applying NLP has increased rapidly over the past five years. Document classifiers trained to differentiate included and excluded articles demonstrated moderate performance (XLNet AUC = 0.66, BERT AUC = 0.59, RoBERTa AUC = 0.62). The T5 transformer model generated acceptable abstract summaries. The application of NLP has the potential to enhance research and practice in the clinical neurosciences.
Affiliation(s)
- Quinlan D Buchlak
- School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia.
- Nazanin Esmaili
- School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
- Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, Australia
- Christine Bennett
- School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
- Farrokh Farrokhi
- Neuroscience Institute, Virginia Mason Medical Center, Seattle, WA, USA
21
de Oliveira JM, da Costa CA, Antunes RS. Data structuring of electronic health records: a systematic review. Health Technol 2021. [DOI: 10.1007/s12553-021-00607-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/10/2023]
22
Weissler EH, Naumann T, Andersson T, Ranganath R, Elemento O, Luo Y, Freitag DF, Benoit J, Hughes MC, Khan F, Slater P, Shameer K, Roe M, Hutchison E, Kollins SH, Broedl U, Meng Z, Wong JL, Curtis L, Huang E, Ghassemi M. The role of machine learning in clinical research: transforming the future of evidence generation. Trials 2021; 22:537. [PMID: 34399832 PMCID: PMC8365941 DOI: 10.1186/s13063-021-05489-x] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 04/30/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022]
Abstract
BACKGROUND Interest in the application of machine learning (ML) to the design, conduct, and analysis of clinical trials has grown, but the evidence base for such applications has not been surveyed. This manuscript reviews the proceedings of a multi-stakeholder conference to discuss the current and future state of ML for clinical research. Key areas of clinical trial methodology in which ML holds particular promise and priority areas for further investigation are presented alongside a narrative review of evidence supporting the use of ML across the clinical trial spectrum. RESULTS Conference attendees included stakeholders such as biomedical and ML researchers, representatives from the US Food and Drug Administration (FDA), artificial intelligence technology and data analytics companies, non-profit organizations, patient advocacy groups, and pharmaceutical companies. ML contributions to clinical research were highlighted in the pre-trial phase, cohort selection and participant management, and data collection and analysis. Particular attention was paid to the operational and philosophical barriers to ML in clinical research. Peer-reviewed evidence was noted to be lacking in several areas. CONCLUSIONS ML holds great promise for improving the efficiency and quality of clinical research, but substantial barriers remain, the surmounting of which will require addressing significant gaps in evidence.
Affiliation(s)
- E Hope Weissler
- Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA.
- Rajesh Ranganath
- Courant Institute of Mathematical Science, New York University, New York, NY, USA
- Olivier Elemento
- Englander Institute for Precision Medicine, Weill Cornell Medical College, New York, NY, USA
- Yuan Luo
- Northwestern University Clinical and Translational Sciences Institute, Northwestern University, Chicago, IL, USA
- Daniel F Freitag
- Division Pharmaceuticals, Open Innovation and Digital Technologies, Bayer AG, Wuppertal, Germany
- James Benoit
- University of Alberta, Edmonton, Alberta, Canada
- Michael C Hughes
- Department of Computer Science, Tufts University, Medford, MA, USA
- Scott H Kollins
- Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA
- Uli Broedl
- Boehringer-Ingelheim, Burlington, Canada
- Lesley Curtis
- Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA
- Erich Huang
- Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA
- Duke Forge, Durham, NC, USA
- Marzyeh Ghassemi
- Vector Institute, University of Toronto, Toronto, Ontario, Canada
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139, USA
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139, USA
- CIFAR AI Chair, Vector Institute, Toronto, Ontario, Canada
23
Dobbie S, Strafford H, Pickrell WO, Fonferko-Shadrach B, Jones C, Akbari A, Thompson S, Lacey A. Markup: A Web-Based Annotation Tool Powered by Active Learning. Front Digit Health 2021; 3:598916. [PMID: 34713086 PMCID: PMC8521860 DOI: 10.3389/fdgth.2021.598916] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/25/2020] [Accepted: 06/16/2021] [Indexed: 11/13/2022]
Abstract
Across various domains, such as health and social care, law, news, and social media, there are increasing quantities of unstructured texts being produced. These potential data sources often contain rich information that could be used for domain-specific and research purposes. However, the unstructured nature of free-text data poses a significant challenge for its utilisation due to the necessity of substantial manual intervention from domain experts to label embedded information. Annotation tools can assist with this process by providing functionality that enables the accurate capture and transformation of unstructured texts into structured annotations, which can be used individually, or as part of larger Natural Language Processing (NLP) pipelines. We present Markup (https://www.getmarkup.com/), an open-source, web-based annotation tool that is undergoing continued development for use across all domains. Markup incorporates NLP and Active Learning (AL) technologies to enable rapid and accurate annotation using custom user configurations, predictive annotation suggestions, and automated mapping suggestions to both domain-specific ontologies, such as the Unified Medical Language System (UMLS), and custom, user-defined ontologies. We demonstrate a real-world use case of how Markup has been used in a healthcare setting to annotate structured information from unstructured clinic letters, where captured annotations were used to build and test NLP applications.
Affiliation(s)
- Samuel Dobbie
- Health Data Research UK, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Huw Strafford
- Health Data Research UK, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Swansea University Medical School, Swansea University, Swansea, United Kingdom
- W. Owen Pickrell
- Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Neurology Department, Morriston Hospital, Swansea Bay University Health Board, Swansea, United Kingdom
- Carys Jones
- Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Ashley Akbari
- Health Data Research UK, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Simon Thompson
- Health Data Research UK, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Arron Lacey
- Health Data Research UK, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Swansea University Medical School, Swansea University, Swansea, United Kingdom
24
Ford E, Curlewis K, Squires E, Griffiths LJ, Stewart R, Jones KH. The Potential of Research Drawing on Clinical Free Text to Bring Benefits to Patients in the United Kingdom: A Systematic Review of the Literature. Front Digit Health 2021; 3:606599. [PMID: 34713089 PMCID: PMC8521813 DOI: 10.3389/fdgth.2021.606599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2020] [Accepted: 01/15/2021] [Indexed: 11/13/2022]
Abstract
Background: The analysis of clinical free text from patient records for research has potential to contribute to the medical evidence base, but access to clinical free text is frequently denied by data custodians who perceive that the privacy risks of data-sharing are too high. Engagement activities with patients and regulators, where views on the sharing of clinical free text data for research have been discussed, have identified that stakeholders would like to understand the potential clinical benefits that could be achieved if access to free text for clinical research were improved. We aimed to systematically review all UK research studies which used clinical free text and report direct or potential benefits to patients, synthesizing possible benefits into an easy-to-communicate taxonomy for public engagement and policy discussions. Methods: We conducted a systematic search for articles which reported primary research using clinical free text, drawn from UK health record databases, which reported a benefit or potential benefit for patients, actionable in a clinical environment or health service, and not solely methods development or data quality improvement. We screened eligible papers and thematically analyzed information about clinical benefits reported in the paper to create a taxonomy of benefits. Results: We identified 43 papers and derived five themes of benefits: health-care quality or services improvement, observational risk factor-outcome research, drug prescribing safety, case-finding for clinical trials, and development of clinical decision support. Five papers compared study quality with and without free text and found an improvement in accuracy when free text was included in analytical models. Conclusions: Findings will help stakeholders weigh the potential benefits of free text research against perceived risks to patient privacy. The taxonomy can be used to aid public and policy discussions, and identified studies could form a public-facing repository which will help the health-care text analysis research community better communicate the impact of their work.
Affiliation(s)
- Elizabeth Ford
- Department of Primary Care and Public Health, Brighton and Sussex Medical School, Brighton, United Kingdom
- Keegan Curlewis
- Department of Primary Care and Public Health, Brighton and Sussex Medical School, Brighton, United Kingdom
- Emma Squires
- Swansea Medical School, University of Swansea, Swansea, United Kingdom
- Lucy J. Griffiths
- Swansea Medical School, University of Swansea, Swansea, United Kingdom
- Robert Stewart
- King's College London, London, United Kingdom
- South London and Maudsley NHS Foundation Trust, London, United Kingdom
- Kerina H. Jones
- Swansea Medical School, University of Swansea, Swansea, United Kingdom
25
Decker BM, Hill CE, Baldassano SN, Khankhanian P. Can antiepileptic efficacy and epilepsy variables be studied from electronic health records? A review of current approaches. Seizure 2021; 85:138-144. [PMID: 33461032 DOI: 10.1016/j.seizure.2020.11.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/30/2020] [Revised: 11/16/2020] [Accepted: 11/17/2020] [Indexed: 12/16/2022]
Abstract
As automated data extraction and natural language processing (NLP) are rapidly evolving, improving healthcare delivery by harnessing large data is garnering great interest. Assessing antiepileptic drug (AED) efficacy and other epilepsy variables pertinent to healthcare delivery remains a critical barrier to improving patient care. In this systematic review, we examined automatic electronic health record (EHR) extraction methodologies pertinent to epilepsy. We also reviewed more generalizable NLP pipelines to extract other critical patient variables. Our review found varying reports of performance measures. Whereas automated data extraction pipelines are a crucial advancement, this review calls attention to standardizing NLP methodology and accuracy reporting for greater generalizability. Moreover, the use of crowdsourcing competitions to spur innovative NLP pipelines would further advance this field.
Affiliation(s)
- Barbara M Decker
- Center for Neuroengineering and Therapeutics, Department of Neurology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA, 19104, United States.
- Chloé E Hill
- Department of Neurology, University of Michigan, 1500 East Medical Center Drive, Ann Arbor, MI, 48109, United States
- Steven N Baldassano
- Center for Neuroengineering and Therapeutics, Department of Neurology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA, 19104, United States
- Pouya Khankhanian
- Center for Neuroengineering and Therapeutics, Department of Neurology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA, 19104, United States
26
Identification of seizure clusters using free text notes in an electronic seizure diary. Epilepsy Behav 2020; 113:107498. [PMID: 33096508 DOI: 10.1016/j.yebeh.2020.107498] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/31/2020] [Revised: 09/09/2020] [Accepted: 09/12/2020] [Indexed: 11/21/2022]
Abstract
SIGNIFICANCE Online seizure diaries offer a wealth of information regarding the real-world experience of patients living with epilepsy. Free text notes (FTN) written by patients reflect their concerns and priorities and provide supplemental information to structured diary data. OBJECTIVE This project evaluated the feasibility of using an automated lexical analysis to identify FTN relevant to seizure clusters (SCs). METHODS Data were extracted from EpiDiary™, a free electronic epilepsy diary with 42,799 unique users, generating 1,096,168 entries and 247,232 FTN. Both structured data and FTN were analyzed for the presence of SC. A pilot study was conducted to validate an automated lexical analysis algorithm to identify SC in FTN in a sample of 98 diaries. The lexical analysis was then applied to the entire dataset. Outcomes included cluster prevalence and frequency, as well as the types of triggers commonly reported. RESULTS At least one FTN was found in 13,987 (32.68%) individual diaries. The automated lexical analysis algorithm identified 5797 FTN as SC. There were 2423 unique patients with SC who were not identified by structured data alone and were identified only by lexical analysis of FTN. Seizure clusters were identified in n = 10,331 (24.1%) diary users through both structured data and FTN. The median number of SC days per year was 13.7 (interquartile range (IQR): 3.2-54.7). The median number of seizures in a cluster day was 3 (IQR 2-4). The most common missed medication linked to patients with SC was levetiracetam (n = 576, 29%), followed by lamotrigine (n = 495, 24%), topiramate (n = 208, 10.5%), carbamazepine (n = 190, 9.6%), and lacosamide (n = 170, 8.6%). These percentages generally reflected the prevalence of medication use in this population. The use of rescue medications was documented in 3306 structured entries and 4305 FTN. CONCLUSION This exploratory study demonstrates a novel approach applying lexical analysis to previously untapped FTN in a large electronic seizure diary database. Free text notes captured information about SC not available from the structured diary data. Diary FTN contain information of high importance to people with epilepsy, written in their own words.
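As a rough illustration of what a lexical analysis of free text notes might look like, the sketch below flags notes that match a small set of cue phrases. The abstract does not describe the study's actual lexicon or algorithm, so the patterns and the function names `mentions_cluster` and `flag_notes` are hypothetical and chosen only for demonstration.

```python
import re

# Hypothetical cue phrases (assumption): the abstract does not list the
# terms used by the study's lexical analysis algorithm.
CLUSTER_PATTERNS = [
    r"\bclusters?\b",
    r"\bback[- ]to[- ]back\b",
    r"\b(?:several|multiple|many)\s+seizures\b",
    r"\b\d+\s+seizures\s+(?:today|in a row|within)\b",
]
CLUSTER_RE = re.compile("|".join(CLUSTER_PATTERNS), re.IGNORECASE)


def mentions_cluster(note: str) -> bool:
    """Return True if the free text note matches any cue phrase."""
    return CLUSTER_RE.search(note) is not None


def flag_notes(notes):
    """Return the notes that possibly describe a seizure cluster."""
    return [note for note in notes if mentions_cluster(note)]
```

A keyword approach like this is easy to audit but cannot handle negation ("no clusters this week"), which is one reason a validation step against manually reviewed diaries, as in the pilot study described above, would be needed.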
27
Weissler EH, Zhang J, Lippmann S, Rusincovitch S, Henao R, Jones WS. Use of Natural Language Processing to Improve Identification of Patients With Peripheral Artery Disease. Circ Cardiovasc Interv 2020; 13:e009447. [PMID: 33040585 DOI: 10.1161/circinterventions.120.009447] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]
Abstract
BACKGROUND Peripheral artery disease (PAD) is underrecognized, undertreated, and understudied: each of these endeavors requires efficient and accurate identification of patients with PAD. Currently, PAD patient identification relies on diagnosis/procedure codes or lists of patients diagnosed or treated by specific providers in specific locations and ways. The goal of this research was to leverage natural language processing to more accurately identify patients with PAD in an electronic health record system compared with a structured data-based approach. METHODS The clinical notes from a cohort of 6861 patients in our health system whose PAD status had previously been adjudicated were used to train, test, and validate a natural language processing model using 10-fold cross-validation. The performance of this model was described using the area under the receiver operating characteristic and average precision curves; its performance was quantitatively compared with an administrative data-based least absolute shrinkage and selection operator (LASSO) approach using the DeLong test. RESULTS The median (SD) of the area under the receiver operating characteristic curve for the natural language processing model was 0.888 (0.009) versus 0.801 (0.017) for the LASSO-based approach alone (DeLong P<0.0001). The median (SD) of the area under the precision curve was 0.909 (0.008) versus 0.816 (0.012) for the structured data-based approach. When sensitivity was set at 90%, the precision for LASSO was 65% and the machine learning approach was 74%, while the specificity for LASSO was 41% and for the machine learning approach was 62%. CONCLUSIONS Using a natural language processing approach in addition to partial cohort preprocessing with a LASSO-based model, we were able to meaningfully improve our ability to identify patients with PAD compared with an approach using structured data alone. 
This model has potential applications both to interventions targeted at improving patient care and to efficient, large-scale PAD research.
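The comparison at a fixed sensitivity reported in this abstract can be illustrated with a minimal sketch: choose the highest score threshold whose sensitivity reaches the target, then read off precision and specificity at that threshold. The function name `metrics_at_sensitivity` and the toy scores are assumptions for illustration; this is not the authors' code or data.

```python
def metrics_at_sensitivity(scores, labels, target_sensitivity=0.90):
    """Find the highest threshold whose sensitivity meets the target and
    report sensitivity, precision and specificity at that threshold.

    scores: predicted probabilities; labels: 1 = condition present, 0 = absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    # Lowering the threshold can only increase sensitivity, so the first
    # threshold that meets the target is the most conservative one.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        sensitivity = tp / positives
        if sensitivity >= target_sensitivity:
            return {
                "threshold": threshold,
                "sensitivity": sensitivity,
                "precision": tp / (tp + fp),
                "specificity": (negatives - fp) / negatives,
            }


# Toy data (hypothetical), not taken from the study.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]
toy_result = metrics_at_sensitivity(toy_scores, toy_labels, target_sensitivity=0.75)
```

Fixing sensitivity first makes two models comparable at the same miss rate, which is why precision and specificity can then be contrasted directly between them.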
Affiliation(s)
- E Hope Weissler
- Division of Vascular and Endovascular Surgery (E.H.W.), Duke University School of Medicine, Durham, NC
- Jikai Zhang
- Department of Biostatistics and Bioinformatics (J.Z., R.H.), Duke University School of Medicine, Durham, NC
- Steven Lippmann
- Department of Population Health Sciences (S.L., W.S.J.), Duke University School of Medicine, Durham, NC
- Ricardo Henao
- Department of Biostatistics and Bioinformatics (J.Z., R.H.), Duke University School of Medicine, Durham, NC
- Duke Forge (S.R., R.H.), Duke University School of Medicine, Durham, NC
- W Schuyler Jones
- Department of Population Health Sciences (S.L., W.S.J.), Duke University School of Medicine, Durham, NC
- Division of Cardiology (W.S.J.), Duke University School of Medicine, Durham, NC
28
Jones KH, Ford EM, Lea N, Griffiths LJ, Hassan L, Heys S, Squires E, Nenadic G. Toward the Development of Data Governance Standards for Using Clinical Free-Text Data in Health Research: Position Paper. J Med Internet Res 2020; 22:e16760. [PMID: 32597785 PMCID: PMC7367542 DOI: 10.2196/16760] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/22/2019] [Revised: 03/06/2020] [Accepted: 03/23/2020] [Indexed: 01/17/2023]
Abstract
BACKGROUND: Clinical free-text data (eg, outpatient letters or nursing notes) represent a vast, untapped source of rich information that, if more accessible for research, would clarify and supplement information coded in structured data fields. Data usually need to be deidentified or anonymized before they can be reused for research, but there is a lack of established guidelines to govern effective deidentification and use of free-text information and avoid damaging data utility as a by-product.
OBJECTIVE: This study aimed to develop recommendations for the creation of data governance standards to integrate with existing frameworks for personal data use, to enable free-text data to be used safely for research for patient and public benefit.
METHODS: We outlined data protection legislation and regulations relating to the United Kingdom for context and conducted a rapid literature review and UK-based case studies to explore data governance models used in working with free-text data. We also engaged with stakeholders, including text-mining researchers and the general public, to explore perceived barriers and solutions in working with clinical free-text.
RESULTS: We proposed a set of recommendations, including the need for authoritative guidance on data governance for the reuse of free-text data, to ensure public transparency in data flows and uses, to treat deidentified free-text data as potentially identifiable with use limited to accredited data safe havens, and to commit to a culture of continuous improvement to understand the relationships between the efficacy of deidentification and reidentification risks, so this can be communicated to all stakeholders.
CONCLUSIONS: By drawing together the findings of a combination of activities, we present a position paper to contribute to the development of data governance standards for the reuse of clinical free-text data for secondary purposes. While working in accordance with existing data governance frameworks, there is a need for further work to take forward the recommendations we have proposed, with commitment and investment, to assure and expand the safe reuse of clinical free-text data for public benefit.
Affiliation(s)
- Kerina H Jones
- Population Data Science, Medical School, Swansea University, Swansea, United Kingdom
- Nathan Lea
- Institute of Health Informatics, University College London, London, United Kingdom
- Lucy J Griffiths
- Population Data Science, Medical School, Swansea University, Swansea, United Kingdom
- Lamiece Hassan
- Division of Informatics, Imaging & Data Sciences, University of Manchester, Manchester, United Kingdom
- Sharon Heys
- Population Data Science, Medical School, Swansea University, Swansea, United Kingdom
- Emma Squires
- Population Data Science, Medical School, Swansea University, Swansea, United Kingdom
- Goran Nenadic
- Department of Computer Science, University of Manchester & The Alan Turing Institute, Manchester, United Kingdom
29
Chiang KL, Huang CY, Hsieh LP, Chang KP. A propositional AI system for supporting epilepsy diagnosis based on the 2017 epilepsy classification: Illustrated by Dravet syndrome. Epilepsy Behav 2020; 106:107021. [PMID: 32224446 DOI: 10.1016/j.yebeh.2020.107021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/25/2019] [Revised: 03/02/2020] [Accepted: 03/02/2020] [Indexed: 01/01/2023]
Abstract
PURPOSE: The 2017 epilepsy and seizure diagnosis framework emphasizes epilepsy syndromes and an etiology-based approach. We developed a propositional artificial intelligence (AI) system based on these concepts to support physicians in the diagnosis of epilepsy.
METHODS: We analyzed and built ontology knowledge for the classification of seizure patterns, epilepsy, epilepsy syndromes, and etiologies using the Protégé ontology tool. To bring the system closer to the inferential thinking of clinical experts, we also classified and modeled other epilepsy-related knowledge, including comorbidities, epilepsy imitators, epilepsy descriptors, characteristic electroencephalography (EEG) findings, and treatments. We used the Web Ontology Language with Description Logic (OWL-DL) and the Semantic Web Rule Language (SWRL) to design rules expressing the relationships between these ontologies.
RESULTS: Dravet syndrome was taken as an illustration of epilepsy syndrome implementation. We designed an interface for the physician to enter the various characteristics of the patient. Clinical data of an 18-year-old boy with epilepsy were applied to the AI system. Through SWRL rules executed by the Drools reasoning engine, we successfully demonstrated the process of differential diagnosis.
CONCLUSION: We developed a propositional AI system using the OWL-DL/SWRL approach to deal with the complexity of current epilepsy diagnosis. Experience with this system, centered on clinical epilepsy syndromes, paves the way for constructing AI systems for more complicated epilepsy diagnoses.
Affiliation(s)
- Kuo-Liang Chiang
- Department of Pediatric Neurology, Kuang-Tien General Hospital, No. 117, Shatian Road, Shalu District, Taichung 43303, Taiwan; Department of Nutrition, Hungkuang University, No. 1018, Section 6, Taiwan Boulevard, Shalu District, Taichung 43302, Taiwan; Department of Industrial Engineering and Enterprise Information, Tunghai University, P.O. Box 985, Taichung 40704, Taiwan
- Chin-Yin Huang
- Department of Industrial Engineering and Enterprise Information, Tunghai University, P.O. Box 985, Taichung 40704, Taiwan; Program for Health Administration, Tunghai University, P.O. Box 985, Taichung 40704, Taiwan
- Liang-Po Hsieh
- Department of Neurology, Cheng-Ching Hospital, No. 966, Section 4, Taiwan Boulevard, Xitun District, Taichung 40764, Taiwan
- Kai-Ping Chang
- Department of Pediatric Neurology, Taipei Veterans General Hospital, No. 201, Section 2, Shipai Rd., Beitou District, Taipei 11217, Taiwan
30
Hwang JE, Seoung BO, Lee SO, Shin SY. Implementing Structured Clinical Templates at a Single Tertiary Hospital: Survey Study. JMIR Med Inform 2020; 8:e13836. [PMID: 32352392 PMCID: PMC7226057 DOI: 10.2196/13836] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/28/2019] [Revised: 11/26/2019] [Accepted: 02/26/2020] [Indexed: 02/06/2023] Open
Abstract
Background: Electronic health record (EHR) systems have been widely adopted in hospitals. However, since current EHRs mainly focus on lowering the number of paper documents used, they have suffered from poor search function and reusability capabilities. To overcome these drawbacks, structured clinical templates have been proposed; however, they are not widely used owing to the inconvenience of data entry.
Objective: This study aims to verify the usability of structured templates by comparing data entry times.
Methods: A Korean tertiary hospital has implemented structured clinical templates with the modeling of clinical contents for the last 6 years. As a result, 1238 clinical content models (ie, body measurements, vital signs, and allergies) have been developed and 492 models for 13 clinical templates, including pathology reports, were applied to EHRs for clinical practice. Then, to verify the usability of the structured templates, data entry times for free-text entry and for four structured pathology report templates were compared using 4391 entries from structured data entry (SDE) log data and 4265 entries from free-text log data. In addition, a paper-based survey and a focus group interview were conducted with 23 participants from three different groups, including EHR developers, pathology transcriptionists, and clinical data extraction team members.
Results: Based on the analysis of time required for data entry, in most cases, beginner users of the structured clinical templates required up to 70.18% more time for data entry. However, as users became accustomed to the templates, they were able to enter data more quickly than via free-text entry, saving from 1 minute and 23 seconds (16.8%) to 5 minutes and 42 seconds (27.6%). Interestingly, well-designed thyroid cancer pathology reports required 14.54% less data entry time from the beginning of the SDE implementation. In the interviews and survey, we confirmed that most of the interviewees agreed on the need for structured templates. However, they were skeptical about structuring all the items included in the templates.
Conclusions: The increase in initial elapsed time led users to hold a negative opinion of SDE, despite its benefits. To overcome these obstacles, it is necessary to structure the clinical templates for optimum use. In addition, user experience in terms of ease of data entry must be considered as an essential aspect in the development of structured clinical templates.
Affiliation(s)
- Ji Eun Hwang
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, Republic of Korea
- Byung Ook Seoung
- Office of Medical Information, Asan Medical Center, Seoul, Republic of Korea
- Sang-Oh Lee
- Office of Medical Information, Asan Medical Center, Seoul, Republic of Korea; Department of Infectious Diseases, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Soo-Yong Shin
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, Republic of Korea
31
Baldassano SN, Hill CE, Shankar A, Bernabei J, Khankhanian P, Litt B. Big data in status epilepticus. Epilepsy Behav 2019; 101:106457. [PMID: 31444029 PMCID: PMC6944751 DOI: 10.1016/j.yebeh.2019.106457] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/30/2019] [Accepted: 07/26/2019] [Indexed: 12/23/2022]
Abstract
Status epilepticus care and treatment are already being touched by the revolution in data science. New approaches designed to leverage the tremendous potential of "big data" in the clinical sphere are enabling researchers and clinicians to extract information from sources such as administrative claims data, the electronic medical health record, and continuous physiologic monitoring data streams. Algorithmic methods of data extraction also offer potential to fuse multimodal data (including text-based documentation, imaging data, and time-series data) to improve patient assessment and stratification beyond the manual capabilities of individual physicians. Still, the potential of data science to impact the diagnosis, treatment, and minute-to-minute care of patients with status epilepticus is only starting to be appreciated. In this brief review, we discuss how data science is impacting the field and draw examples from the following three main areas: (1) analysis of insurance claims from large administrative datasets to evaluate the impact of continuous electroencephalogram (EEG) monitoring on clinical outcomes; (2) natural language processing of the electronic health record to find, classify, and stratify patients for prognostication and treatment; and (3) real-time systems for data analysis, data reduction, and multimodal data fusion to guide therapy in real time. While early, it is our hope that these examples will stimulate investigators to leverage data science, computer science, and engineering methods to improve the care and outcome of patients with status epilepticus and other neurological disorders. This article is part of the Special Issue "Proceedings of the 7th London-Innsbruck Colloquium on Status Epilepticus and Acute Seizures".
Affiliation(s)
- Steven N. Baldassano
- Department of Bioengineering, University of Pennsylvania, 210 South 33rd Street, Philadelphia, PA 19104, United States; Center for Neuroengineering and Therapeutics, University of Pennsylvania, 240 South 33rd Street, Philadelphia, PA 19104, United States
- Chloé E. Hill
- Department of Neurology, University of Michigan, 1500 East Medical Center Drive, Ann Arbor, MI 48109, United States
- Arjun Shankar
- Department of Bioengineering, University of Pennsylvania, 210 South 33rd Street, Philadelphia, PA 19104, United States; Center for Neuroengineering and Therapeutics, University of Pennsylvania, 240 South 33rd Street, Philadelphia, PA 19104, United States
- John Bernabei
- Department of Bioengineering, University of Pennsylvania, 210 South 33rd Street, Philadelphia, PA 19104, United States; Center for Neuroengineering and Therapeutics, University of Pennsylvania, 240 South 33rd Street, Philadelphia, PA 19104, United States
- Pouya Khankhanian
- Department of Neurology, University of Michigan, 1500 East Medical Center Drive, Ann Arbor, MI 48109, United States; Department of Neurology, Penn Epilepsy Center, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, United States
- Brian Litt
- Department of Bioengineering, University of Pennsylvania, 210 South 33rd Street, Philadelphia, PA 19104, United States; Center for Neuroengineering and Therapeutics, University of Pennsylvania, 240 South 33rd Street, Philadelphia, PA 19104, United States; Department of Neurology, Penn Epilepsy Center, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, United States