Reference Citation Analysis

For: Wang J, Abu-El-Rub N, Gray J, Pham HA, Zhou Y, Manion FJ, Liu M, Song X, Xu H, Rouhizadeh M, Zhang Y. COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model. J Am Med Inform Assoc 2021;28:1275-1283. [PMID: 33674830 PMCID: PMC7989301 DOI: 10.1093/jamia/ocab015] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 11/20/2020] [Accepted: 01/29/2021] [Indexed: 11/12/2022]

Cited by Other Article(s)

McMurry AJ, Zipursky AR, Geva A, Olson KL, Jones JR, Ignatov V, Miller TA, Mandl KD. Moving Biosurveillance Beyond Coded Data Using AI for Symptom Detection From Physician Notes: Retrospective Cohort Study. J Med Internet Res 2024;26:e53367. [PMID: 38573752 PMCID: PMC11027052 DOI: 10.2196/53367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 11/30/2023] [Accepted: 02/27/2024] [Indexed: 04/05/2024]

Abstract

BACKGROUND

Real-time surveillance of emerging infectious diseases necessitates a dynamically evolving, computable case definition, which frequently incorporates symptom-related criteria. For symptom detection, both population health monitoring platforms and research initiatives primarily depend on structured data extracted from electronic health records.

OBJECTIVE

This study sought to validate and test an artificial intelligence (AI)-based natural language processing (NLP) pipeline for detecting COVID-19 symptoms from physician notes in pediatric patients. We specifically study patients presenting to the emergency department (ED) who can be sentinel cases in an outbreak.

METHODS

Subjects in this retrospective cohort study are patients who are 21 years of age and younger, who presented to a pediatric ED at a large academic children's hospital between March 1, 2020, and May 31, 2022. The ED notes for all patients were processed with an NLP pipeline tuned to detect the mention of 11 COVID-19 symptoms based on Centers for Disease Control and Prevention (CDC) criteria. For a gold standard, 3 subject matter experts labeled 226 ED notes and had strong agreement (F1-score=0.986; positive predictive value [PPV]=0.972; and sensitivity=1.0). F1-score, PPV, and sensitivity were used to compare the performance of both NLP and the International Classification of Diseases, 10th Revision (ICD-10) coding to the gold standard chart review. As a formative use case, variations in symptom patterns were measured across SARS-CoV-2 variant eras.
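The agreement figures quoted above can be cross-checked against each other, because the F1-score is the harmonic mean of PPV (precision) and sensitivity (recall). The following is a minimal sketch of that arithmetic; the function name `f1_from_ppv_sensitivity` is an illustrative choice and is not taken from the study.

```python
def f1_from_ppv_sensitivity(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of positive predictive value and sensitivity."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Inter-annotator agreement values as reported in the abstract
agreement_f1 = f1_from_ppv_sensitivity(ppv=0.972, sensitivity=1.0)
print(round(agreement_f1, 3))  # 0.986
```

The result matches the reported F1-score of 0.986, so the three agreement figures are mutually consistent.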

RESULTS

There were 85,678 ED encounters during the study period, including 4% (n=3420) with patients with COVID-19. NLP was more accurate at identifying encounters with patients that had any of the COVID-19 symptoms (F1-score=0.796) than ICD-10 codes (F1-score =0.451). NLP accuracy was higher for positive symptoms (sensitivity=0.930) than ICD-10 (sensitivity=0.300). However, ICD-10 accuracy was higher for negative symptoms (specificity=0.994) than NLP (specificity=0.917). Congestion or runny nose showed the highest accuracy difference (NLP: F1-score=0.828 and ICD-10: F1-score=0.042). For encounters with patients with COVID-19, prevalence estimates of each NLP symptom differed across variant eras. Patients with COVID-19 were more likely to have each NLP symptom detected than patients without this disease. Effect sizes (odds ratios) varied across pandemic eras.
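The metrics compared in these results (sensitivity, specificity, PPV, and F1-score) all derive from the same four confusion-matrix counts. The sketch below shows how they relate. The counts are made-up illustration values, not data from the study, and the code assumes every denominator is non-zero.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly positive cases that were detected (recall)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of truly negative cases that were correctly left unflagged."""
    return tn / (tn + fp)


def ppv(tp: int, fp: int) -> float:
    """Share of flagged cases that were truly positive (precision)."""
    return tp / (tp + fp)


def f1(tp: int, fp: int, fn: int) -> float:
    """F1-score expressed directly in counts: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts, chosen only to make the arithmetic easy to follow
tp, fp, tn, fn = 80, 10, 90, 20

print(f"sensitivity={sensitivity(tp, fn):.3f}")
print(f"specificity={specificity(tn, fp):.3f}")
print(f"PPV={ppv(tp, fp):.3f}")
print(f"F1={f1(tp, fp, fn):.3f}")
```

Because sensitivity ignores false positives and specificity ignores false negatives, one method can lead on one measure while trailing on the other, which is the pattern reported for NLP versus ICD-10.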

CONCLUSIONS

This study establishes the value of AI-based NLP as a highly effective tool for real-time COVID-19 symptom detection in pediatric patients, outperforming traditional ICD-10 methods. It also reveals the evolving nature of symptom prevalence across different virus variants, underscoring the need for dynamic, technology-driven approaches in infectious disease surveillance.

Xie F, Chang J, Luong T, Wu B, Lustigova E, Shrader E, Chen W. Identifying Symptoms Prior to Pancreatic Ductal Adenocarcinoma Diagnosis in Real-World Care Settings: Natural Language Processing Approach. JMIR AI 2024;3:e51240. [PMID: 38875566 PMCID: PMC11041417 DOI: 10.2196/51240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 12/08/2023] [Accepted: 12/16/2023] [Indexed: 06/16/2024]

Abstract

BACKGROUND

Pancreatic cancer is the third leading cause of cancer deaths in the United States. Pancreatic ductal adenocarcinoma (PDAC) is the most common form of pancreatic cancer, accounting for up to 90% of all cases. Patient-reported symptoms are often the triggers of cancer diagnosis and therefore, understanding the PDAC-associated symptoms and the timing of symptom onset could facilitate early detection of PDAC.

OBJECTIVE

This paper aims to develop a natural language processing (NLP) algorithm to capture symptoms associated with PDAC from clinical notes within a large integrated health care system.

METHODS

We used unstructured data within 2 years prior to PDAC diagnosis between 2010 and 2019 and among matched patients without PDAC to identify 17 PDAC-related symptoms. Related terms and phrases were first compiled from publicly available resources and then recursively reviewed and enriched with input from clinicians and chart review. A computerized NLP algorithm was iteratively developed and fine-trained via multiple rounds of chart review followed by adjudication. Finally, the developed algorithm was applied to the validation data set to assess performance and to the study implementation notes.

RESULTS

A total of 408,147 and 709,789 notes were retrieved from 2611 patients with PDAC and 10,085 matched patients without PDAC, respectively. In descending order, the symptom distribution of the study implementation notes ranged from 4.98% for abdominal or epigastric pain to 0.05% for upper extremity deep vein thrombosis in the PDAC group, and from 1.75% for back pain to 0.01% for pale stool in the non-PDAC group. Validation of the NLP algorithm against adjudicated chart review results of 1000 notes showed that precision ranged from 98.9% (jaundice) to 84% (upper extremity deep vein thrombosis), recall ranged from 98.1% (weight loss) to 82.8% (epigastric bloating), and F1-scores ranged from 0.97 (jaundice) to 0.86 (depression).

CONCLUSIONS

The developed and validated NLP algorithm could be used for the early detection of PDAC.

Hanson RF, Zhu V, Are F, Espeleta H, Wallis E, Heider P, Kautz M, Lenert L. Initial development of tools to identify child abuse and neglect in pediatric primary care. BMC Med Inform Decis Mak 2023;23:266. [PMID: 37978498 PMCID: PMC10656827 DOI: 10.1186/s12911-023-02361-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Accepted: 11/02/2023] [Indexed: 11/19/2023]

Abstract

BACKGROUND

Child abuse and neglect (CAN) is prevalent, associated with long-term adversities, and often undetected. Primary care settings offer a unique opportunity to identify CAN and facilitate referrals, when warranted. Electronic health records (EHR) contain extensive information to support healthcare decisions, yet time constraints preclude most providers from thorough EHR reviews that could indicate CAN. Strategies that summarize EHR data to identify CAN and convey this to providers have the potential to mitigate CAN-related sequelae. This study used expert review/consensus and Natural Language Processing (NLP) to develop and test a lexicon to characterize children who have experienced or are at risk for CAN and compared machine learning methods to the lexicon + NLP approach to determine the algorithm's performance for identifying CAN.

METHODS

Study investigators identified 90 CAN terms and invited an interdisciplinary group of child abuse experts for review and validation. We then used NLP to develop pipelines to finalize the CAN lexicon. Data for pipeline development and refinement were drawn from a randomly selected sample of EHR from patients seen at pediatric primary care clinics within a U.S. academic health center. To explore a machine learning approach for CAN identification, we used Support Vector Machine algorithms.

RESULTS

The investigator-generated list of 90 CAN terms was reviewed and validated by 25 invited experts, resulting in a final pool of 133 terms. NLP utilized a randomly selected sample of 14,393 clinical notes from 153 patients to test the lexicon, and 0.03% of notes were identified as CAN positive. CAN identification varied by clinical note type, with few differences found by provider type (physicians versus nurses, social workers, etc.). An evaluation of the final NLP pipelines indicated 93.8% positive CAN rate for the training set and 71.4% for the test set, with decreased precision attributed primarily to false positives. For the machine learning approach, SVM pipeline performance was 92% for CAN + and 100% for non-CAN, indicating higher sensitivity than specificity.

CONCLUSIONS

The NLP algorithm's development and refinement suggest that innovative tools can identify youth at risk for CAN. The next key step is to refine the NLP algorithm to eventually funnel this information to care providers to guide clinical decision making.

Michalski AA, Lis K, Stankiewicz J, Kloska SM, Sycz A, Dudziński M, Muras-Szwedziak K, Nowicki M, Bazan-Socha S, Dabrowski MJ, Basak GW. Supporting the Diagnosis of Fabry Disease Using a Natural Language Processing-Based Approach. J Clin Med 2023;12:3599. [PMID: 37240705 DOI: 10.3390/jcm12103599] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/10/2023] [Revised: 05/01/2023] [Accepted: 05/15/2023] [Indexed: 05/28/2023]

Affiliation(s)

Adrian A Michalski: Saventic Health, Polna 66/12 Street, 87-100 Torun, Poland; Department of Analytical Chemistry, Nicolaus Copernicus University Ludwik Rydygier Collegium Medicum, 85-089 Bydgoszcz, Poland
Karol Lis: Saventic Health, Polna 66/12 Street, 87-100 Torun, Poland; Department of Hematology, Transplantation and Internal Medicine, Medical University of Warsaw, 02-097 Warsaw, Poland
Joanna Stankiewicz: Saventic Health, Polna 66/12 Street, 87-100 Torun, Poland; Department of Pediatrics, Hematology and Oncology, Nicolaus Copernicus University Ludwik Rydygier Collegium Medicum, 85-094 Bydgoszcz, Poland
Sylwester M Kloska: Saventic Health, Polna 66/12 Street, 87-100 Torun, Poland; Department of Forensic Medicine, Nicolaus Copernicus University Ludwik Rydygier Collegium Medicum, 85-067 Bydgoszcz, Poland
Arkadiusz Sycz: Saventic Health, Polna 66/12 Street, 87-100 Torun, Poland; Faculty of Mathematics and Information Science, Warsaw University of Technology, 00-662 Warsaw, Poland
Marek Dudziński: Saventic Health, Polna 66/12 Street, 87-100 Torun, Poland; Department of Hematology, Institute of Medical Sciences, College of Medical Sciences, University of Rzeszow, 35-959 Rzeszow, Poland
Katarzyna Muras-Szwedziak: Saventic Foundation, Polna 66/12 Street, 87-100 Torun, Poland; Department of Nephrology, Hypertension and Kidney Transplantation, Medical University of Lodz, 90-419 Lodz, Poland
Michał Nowicki: Saventic Foundation, Polna 66/12 Street, 87-100 Torun, Poland; Department of Nephrology, Hypertension and Kidney Transplantation, Medical University of Lodz, 90-419 Lodz, Poland
Stanisława Bazan-Socha: Saventic Foundation, Polna 66/12 Street, 87-100 Torun, Poland; Department of Internal Medicine, Faculty of Medicine, Jagiellonian University Medical College, 31-008 Krakow, Poland
Michal J Dabrowski: Saventic Health, Polna 66/12 Street, 87-100 Torun, Poland; Computational Biology Group, Institute of Computer Science of the Polish Academy of Sciences, 01-248 Warsaw, Poland
Grzegorz W Basak: Saventic Health, Polna 66/12 Street, 87-100 Torun, Poland; Department of Hematology, Transplantation and Internal Medicine, Medical University of Warsaw, 02-097 Warsaw, Poland

Alshahrani SM, Khan NA. COVID-19 advising application development for Apple devices (iOS). PeerJ Comput Sci 2023;9:e1274. [PMID: 37346730 PMCID: PMC10280587 DOI: 10.7717/peerj-cs.1274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 02/13/2023] [Indexed: 06/23/2023]

Abstract

One of humanity's most devastating health crises was COVID-19. Billions of people suffered during this pandemic. In comparison with previous global pandemics that have been faced by the world before, societies were more accurate with the technical support system during this natural disaster. The intersection of data from healthcare units and the analysis of this data into various sophisticated systems were critical factors. Different healthcare units have taken special consideration to advance technical inputs to fight against such situations. The field of natural language processing (NLP) has dramatically supported this. Despite the primitive methods for monitoring the bio-metric factors of a person, the use of cognitive science has emerged as one of the most critical features during this pandemic era. One of the essential features is the potential to understand the data based on various texts and user inputs. The deployment of various NLP systems is one of the most challenging factors in handling the bulk amount of data flowing from multiple sources. This study focused on developing a powerful application to advise patients suffering from ailments related to COVID-19. The use of NLP refers to facilitating a user to identify the present critical situation and make necessary decisions while getting infected. This article also summarises the challenges associated with NLP and its usage for future NLP-based applications focusing on healthcare units. There are a couple of applications that reside for android-based systems as well as web-based chat-bot systems. In terms of security and safety, application development for iOS is more advanced. This study also explains the block meant of an application for advising COVID-19 infection. A natural language processing powered application for an iOS operating system is indeed one of its kind, which will help people who need to advise proper guidance. The article also portrays NLP-based application development for healthcare problems associated with personal reporting systems.

Wang L, Foer D, Zhang Y, Karlson EW, Bates DW, Zhou L. Post-Acute COVID-19 Respiratory Symptoms in Patients With Asthma: An Electronic Health Records-Based Study. J Allergy Clin Immunol Pract 2023;11:825-835.e3. [PMID: 36566779 PMCID: PMC9773736 DOI: 10.1016/j.jaip.2022.12.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/23/2022] [Revised: 11/27/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022]

Abstract

BACKGROUND

Post-viral respiratory symptoms are common among patients with asthma. Respiratory symptoms after acute COVID-19 are widely reported in the general population, but large-scale studies identifying symptom risk for patients with asthma are lacking.

OBJECTIVE

To identify and compare risk for post-acute COVID-19 respiratory symptoms in patients with and without asthma.

METHODS

This retrospective, observational cohort study included COVID-19-positive patients between March 4, 2020, and January 20, 2021, with up to 180 days of health care follow-up in a health care system in the Northeastern United States. Respiratory symptoms recorded in clinical notes from days 28 to 180 after COVID-19 diagnosis were extracted using natural language processing. Cohorts were stratified by hospitalization status during the acute COVID-19 period. Univariable and multivariable analyses were used to compare symptoms among patients with and without asthma adjusting for demographic and clinical confounders.

RESULTS

Among 31,084 eligible patients with COVID-19, 2863 (9.2%) had hospitalization during the acute COVID-19 period; 4049 (13.0%) had a history of asthma, accounting for 13.8% of hospitalized and 12.9% of nonhospitalized patients. In the post-acute COVID-19 period, patients with asthma had significantly higher risk of shortness of breath, cough, bronchospasm, and wheezing than patients without an asthma history. Incident respiratory symptoms of bronchospasm and wheezing were also higher in patients with asthma. Patients with asthma who had not been hospitalized during acute COVID-19 had additionally higher risk of cough, abnormal breathing, sputum changes, and a wider range of incident respiratory symptoms.

CONCLUSION

Patients with asthma may have an under-recognized burden of respiratory symptoms after COVID-19 warranting increased awareness and monitoring in this population.

Jeon E, Kim A, Lee J, Heo H, Lee H, Woo K. Developing a Classification Algorithm for Prediabetes Risk Detection From Home Care Nursing Notes: Using Natural Language Processing. Comput Inform Nurs 2023:00024665-990000000-00087. [PMID: 37165830 DOI: 10.1097/cin.0000000000001000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]

Kumar A, Sharaff A. PubExN: An Automated PubMed Bulk Article Extractor with Affiliation Normalization Package. SN Comput Sci 2023;4:353. [PMID: 37128512 PMCID: PMC10132428 DOI: 10.1007/s42979-023-01687-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2022] [Accepted: 01/11/2023] [Indexed: 05/03/2023]

Mavragani A, Sanchez T, Ackerson BK, Hong V, Skarbinski J, Yau V, Qian L, Fischer H, Shaw SF, Caparosa S, Xie F. Natural Language Processing for Improved Characterization of COVID-19 Symptoms: Observational Study of 350,000 Patients in a Large Integrated Health Care System. JMIR Public Health Surveill 2022;8:e41529. [PMID: 36446133 PMCID: PMC9822566 DOI: 10.2196/41529] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/29/2022] [Revised: 11/07/2022] [Accepted: 11/29/2022] [Indexed: 12/05/2022]

Abstract

BACKGROUND

Natural language processing (NLP) of unstructured text from electronic medical records (EMR) can improve the characterization of COVID-19 signs and symptoms, but large-scale studies demonstrating the real-world application and validation of NLP for this purpose are limited.

OBJECTIVE

The aim of this paper is to assess the contribution of NLP when identifying COVID-19 signs and symptoms from EMR.

METHODS

This study was conducted in Kaiser Permanente Southern California, a large integrated health care system using data from all patients with positive SARS-CoV-2 laboratory tests from March 2020 to May 2021. An NLP algorithm was developed to extract free text from EMR on 12 established signs and symptoms of COVID-19, including fever, cough, headache, fatigue, dyspnea, chills, sore throat, myalgia, anosmia, diarrhea, vomiting or nausea, and abdominal pain. The proportion of patients reporting each symptom and the corresponding onset dates were described before and after supplementing structured EMR data with NLP-extracted signs and symptoms. A random sample of 100 chart-reviewed and adjudicated SARS-CoV-2-positive cases were used to validate the algorithm performance.
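The abstract does not describe how the algorithm works internally. Purely as an illustration of what extracting symptom mentions from free text can involve, here is a minimal keyword-and-negation sketch. It is not the study's algorithm: the trigger phrases, the negation cue list, and the function name `extract_symptoms` are all assumptions, and only four of the 12 symptoms are covered.

```python
import re

# Symptom names come from the abstract; the trigger phrases and negation
# cues below are illustrative assumptions, not the study's lexicon.
SYMPTOM_TERMS = {
    "fever": ["fever", "febrile"],
    "cough": ["cough"],
    "dyspnea": ["dyspnea", "shortness of breath"],
    "anosmia": ["anosmia", "loss of smell"],
}
NEGATION_CUES = ("no", "denies", "without", "negative for")


def extract_symptoms(note: str) -> dict[str, bool]:
    """Map each mentioned symptom to True (affirmed) or False (only negated)."""
    found: dict[str, bool] = {}
    # Treat periods, semicolons, and newlines as sentence boundaries.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        for symptom, terms in SYMPTOM_TERMS.items():
            for term in terms:
                match = re.search(rf"\b{re.escape(term)}\b", sentence)
                if match is None:
                    continue
                # Naive scope rule: any cue earlier in the sentence negates.
                preceding = sentence[: match.start()]
                negated = any(
                    re.search(rf"\b{re.escape(cue)}\b", preceding)
                    for cue in NEGATION_CUES
                )
                # An affirmed mention anywhere outweighs a negated one.
                found[symptom] = found.get(symptom, False) or not negated
    return found


note = "Patient reports fever and cough. Denies shortness of breath."
print(extract_symptoms(note))
```

The negation rule here is deliberately crude: a cue anywhere earlier in the sentence negates every later symptom, so "no fever but has cough" would wrongly mark cough as negated. Production clinical NLP systems handle negation scope, uncertainty, and historical mentions far more carefully.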

RESULTS

A total of 359,938 patients (mean age 40.4 [SD 19.2] years; 191,630/359,938, 53% female) with confirmed SARS-CoV-2 infection were identified over the study period. The most common signs and symptoms identified through NLP-supplemented analyses were cough (220,631/359,938, 61%), fever (185,618/359,938, 52%), myalgia (153,042/359,938, 43%), and headache (144,705/359,938, 40%). The NLP algorithm identified an additional 55,568 (15%) symptomatic cases that were previously defined as asymptomatic using structured data alone. The proportion of additional cases with each selected symptom identified in NLP-supplemented analysis varied across the selected symptoms, from 29% (63,742/220,631) of all records for cough to 64% (38,884/60,865) of all records with nausea or vomiting. Of the 295,305 symptomatic patients, the median time from symptom onset to testing was 3 days using structured data alone, whereas the NLP algorithm identified signs or symptoms approximately 1 day earlier. When validated against chart-reviewed cases, the NLP algorithm successfully identified signs and symptoms with consistently high sensitivity (ranging from 87% to 100%) and specificity (94% to 100%).
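The percentages reported above follow directly from the stated counts. The short sketch below recomputes them; the helper name `percent` and the dictionary layout are illustrative choices, while all counts are taken from the abstract.

```python
TOTAL_PATIENTS = 359_938


def percent(numerator: int, denominator: int) -> int:
    """Proportion expressed as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Symptom counts from NLP-supplemented analyses, as quoted in the abstract
symptom_counts = {
    "cough": 220_631,
    "fever": 185_618,
    "myalgia": 153_042,
    "headache": 144_705,
}

for symptom, count in symptom_counts.items():
    print(f"{symptom}: {percent(count, TOTAL_PATIENTS)}%")

# Cases reclassified from asymptomatic to symptomatic once NLP was added
print(f"additional symptomatic: {percent(55_568, TOTAL_PATIENTS)}%")
```

Each recomputed value agrees with the figure in the abstract (61%, 52%, 43%, 40%, and 15%).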

CONCLUSIONS

These findings demonstrate that NLP can identify and characterize a broad set of COVID-19 signs and symptoms from unstructured EMR data with enhanced detail and timeliness compared with structured data alone.

Al-Garadi MA, Yang YC, Sarker A. The Role of Natural Language Processing during the COVID-19 Pandemic: Health Applications, Opportunities, and Challenges. Healthcare (Basel) 2022;10:2270. [PMID: 36421593 PMCID: PMC9690240 DOI: 10.3390/healthcare10112270] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/14/2022] [Revised: 11/03/2022] [Accepted: 11/06/2022] [Indexed: 07/30/2023]

A Natural Language Processing (NLP) Evaluation on COVID-19 Rumour Dataset Using Deep Learning Techniques. Comput Intell Neurosci 2022;2022:6561622. [PMID: 36156967 PMCID: PMC9492356 DOI: 10.1155/2022/6561622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 06/18/2022] [Accepted: 07/22/2022] [Indexed: 11/17/2022]

Abstract

Context and Background: Since December 2019, the coronavirus (COVID-19) epidemic has sparked considerable alarm among the general community and significantly affected societal attitudes and perceptions. Apart from the disease itself, many people suffer from anxiety and depression due to the disease and the present threat of an outbreak. Due to the fast propagation of the virus and misleading/fake information, the issues of public discourse alter, resulting in significant confusion in certain places. Rumours are unproven facts or stories that propagate and promote sentiments of prejudice, hatred, and fear.

Objective. The study’s objective is to propose a novel solution to detect fake news using state-of-the-art machines and deep learning models. Furthermore, to analyse which models outperformed in detecting the fake news.

Method. In the research study, we adapted a COVID-19 rumours dataset, which incorporates rumours from news websites and tweets, together with information about the rumours. It is important to analyse data utilizing Natural Language Processing (NLP) and Deep Learning (DL) approaches. Based on the accuracy, precision, recall, and the f1 score, we can assess the effectiveness of the ML and DL algorithms.

Results. The data adopted from the source (mentioned in the paper) have collected 9200 comments from Google and 34,779 Twitter postings filtered for phrases connected with COVID-19-related fake news.

Experiment 1. The dataset was assessed using the following three criteria: veracity, stance, and sentiment. In these terms, we have different labels, and we have applied the DL algorithms separately to each term. We have used different models in the experiment such as (i) LSTM and (ii) Temporal Convolution Networks (TCN). The TCN model has more performance on each measurement parameter in the evaluated results. So, we have used the TCN model for the practical implication for better findings.

Experiment 2. In the second experiment, we have used different state-of-the-art deep learning models and algorithms such as (i) Simple RNN; (ii) LSTM + Word Embedding; (iii) Bidirectional + Word Embedding; (iv) LSTM + CNN-1D; and (v) BERT. Furthermore, we have evaluated the performance of these models on all three datasets, e.g., veracity, stance, and sentiment. Based on our second experimental evaluation, the BERT has a superior performance over the other models compared.

Shapiro M, Landau R, Shay S, Kaminski M, Verhovsky G. Early Detection of COVID-19 outbreaks using Textual Analysis of Electronic Medical Records. J Clin Virol 2022;155:105251. [PMID: 35973330 PMCID: PMC9347140 DOI: 10.1016/j.jcv.2022.105251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 07/10/2022] [Accepted: 08/02/2022] [Indexed: 11/26/2022]

Faris H, Faris M, Habib M, Alomari A. Automatic symptoms identification from a massive volume of unstructured medical consultations using deep neural and BERT models. Heliyon 2022;8:e09683. [PMID: 35761935 PMCID: PMC9233221 DOI: 10.1016/j.heliyon.2022.e09683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Revised: 04/10/2022] [Accepted: 06/01/2022] [Indexed: 11/25/2022]

Knosp BM, Craven CK, Dorr DA, Bernstam EV, Campion TR. Understanding enterprise data warehouses to support clinical and translational research: enterprise information technology relationships, data governance, workforce, and cloud computing. J Am Med Inform Assoc 2022;29:671-676. [PMID: 35289370 PMCID: PMC8922193 DOI: 10.1093/jamia/ocab256] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/15/2021] [Accepted: 11/05/2021] [Indexed: 01/22/2023]

Abstract

OBJECTIVE

Among National Institutes of Health Clinical and Translational Science Award (CTSA) hubs, effective approaches for enterprise data warehouses for research (EDW4R) development, maintenance, and sustainability remain unclear. The goal of this qualitative study was to understand CTSA EDW4R operations within the broader contexts of academic medical centers and technology.

MATERIALS AND METHODS

We performed a directed content analysis of transcripts generated from semistructured interviews with informatics leaders from 20 CTSA hubs.

RESULTS

Respondents referred to services provided by health system, university, and medical school information technology (IT) organizations as "enterprise information technology (IT)." Seventy-five percent of respondents stated that the team providing EDW4R service at their hub was separate from enterprise IT; strong relationships between EDW4R teams and enterprise IT were critical for success. Managing challenges of EDW4R staffing was made easier by executive leadership support. Data governance appeared to be a work in progress, as most hubs reported complex and incomplete processes, especially for commercial data sharing. Although nearly all hubs (n = 16) described use of cloud computing for specific projects, only 2 hubs reported using a cloud-based EDW4R. Respondents described EDW4R cloud migration facilitators, barriers, and opportunities.

DISCUSSION

Descriptions of approaches to how EDW4R teams at CTSA hubs work with enterprise IT organizations, manage workforces, make decisions about data, and approach cloud computing provide insights for institutions seeking to leverage patient data for research.

CONCLUSION

Identification of EDW4R best practices is challenging, and this study helps identify a breadth of viable options for CTSA hubs to consider when implementing EDW4R services.

Hasan A, Levene M, Weston D, Fromson R, Koslover N, Levene T. Monitoring Covid-19 on social media using a novel triage and diagnosis approach. J Med Internet Res 2022;24:e30397. [PMID: 35142636 PMCID: PMC8887561 DOI: 10.2196/30397] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/13/2021] [Revised: 07/09/2021] [Accepted: 02/05/2022] [Indexed: 12/23/2022]

Abstract

Background

The COVID-19 pandemic has created a pressing need for integrating information from disparate sources in order to assist decision makers. Social media is important in this respect; however, to make sense of the textual information it provides and be able to automate the processing of large amounts of data, natural language processing methods are needed. Social media posts are often noisy, yet they may provide valuable insights regarding the severity and prevalence of the disease in the population. Here, we adopt a triage and diagnosis approach to analyzing social media posts using machine learning techniques for the purpose of disease detection and surveillance. We thus obtain useful prevalence and incidence statistics to identify disease symptoms and their severities, motivated by public health concerns.

Objective

This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts in order to provide researchers and public health practitioners with additional information on the symptoms, severity, and prevalence of the disease rather than to provide an actionable decision at the individual level.

Methods

The text processing pipeline first extracted COVID-19 symptoms and related concepts, such as severity, duration, negations, and body parts, from patients’ posts using conditional random fields. An unsupervised rule-based algorithm was then applied to establish relations between concepts in the next step of the pipeline. The extracted concepts and relations were subsequently used to construct 2 different vector representations of each post. These vectors were separately applied to build support vector machine learning models to triage patients into 3 categories and diagnose them for COVID-19.

Results

We reported macro- and microaveraged F₁ scores in the range of 71%-96% and 61%-87%, respectively, for the triage and diagnosis of COVID-19 when the models were trained on human-labeled data. Our experimental results indicated that similar performance can be achieved when the models are trained using predicted labels from concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. In addition, we highlighted important features uncovered by our diagnostic machine learning models and compared them with the most frequent symptoms revealed in another COVID-19 data set. In particular, we found that the most important features are not always the most frequent ones.
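Macro- and microaveraged F1 scores can differ substantially when classes are imbalanced, which is why the two are reported separately. The sketch below contrasts them for a three-class problem. The class names and all counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from statistics import mean


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1-score from true positives, false positives, and false negatives."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical per-class counts as (tp, fp, fn); class_c is rare and hard
per_class = {
    "class_a": (90, 10, 10),
    "class_b": (40, 10, 10),
    "class_c": (5, 5, 5),
}

# Macro average: compute F1 per class, then take the unweighted mean,
# so every class counts equally regardless of its size.
macro_f1 = mean(f1_from_counts(*counts) for counts in per_class.values())

# Micro average: pool the counts across classes, then compute one F1,
# so frequent classes dominate the result.
total_tp = sum(tp for tp, _, _ in per_class.values())
total_fp = sum(fp for _, fp, _ in per_class.values())
total_fn = sum(fn for _, _, fn in per_class.values())
micro_f1 = f1_from_counts(total_tp, total_fp, total_fn)

print(f"macro F1 = {macro_f1:.3f}")
print(f"micro F1 = {micro_f1:.3f}")
```

With these counts the macro average is pulled down by the small, poorly classified class, while the micro average stays close to the performance on the larger classes.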

Conclusions

Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from social media natural language narratives, using a machine learning pipeline in order to provide information on the severity and prevalence of the disease for use within health surveillance systems.

Gupta T, Debele TA, Wei YF, Gupta A, Murtaza M, Su WP. Synergistic Action of Immunotherapy and Nanotherapy against Cancer Patients Infected with SARS-CoV-2 and the Use of Artificial Intelligence. Cancers (Basel) 2022;14:213. [PMID: 35008377 PMCID: PMC8750412 DOI: 10.3390/cancers14010213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2021] [Revised: 12/28/2021] [Accepted: 12/30/2021] [Indexed: 01/08/2023] Open

Yin AL, Guo WL, Sholle ET, Rajan M, Alshak MN, Choi JJ, Goyal P, Jabri A, Li HA, Pinheiro LC, Wehmeyer GT, Weiner M, Safford MM, Campion TR, Cole CL. Comparing automated vs. manual data collection for COVID-specific medications from electronic health records. Int J Med Inform 2022;157:104622. [PMID: 34741892 PMCID: PMC8529289 DOI: 10.1016/j.ijmedinf.2021.104622] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2021] [Revised: 09/19/2021] [Accepted: 10/15/2021] [Indexed: 12/29/2022]

Abstract

INTRODUCTION

Data extraction from electronic health record (EHR) systems occurs through manual abstraction, automated extraction, or a combination of both. While each method has its strengths and weaknesses, both are necessary for retrospective observational research as well as sudden clinical events, like the COVID-19 pandemic. Assessing the strengths, weaknesses, and potentials of these methods is important to continue to understand optimal approaches to extracting clinical data. We set out to assess automated and manual techniques for collecting medication use data in patients with COVID-19 to inform future observational studies that extract data from the EHR.

MATERIALS AND METHODS

For 4,123 COVID-positive patients hospitalized and/or seen in the emergency department at an academic medical center between 03/03/2020 and 05/15/2020, we compared medication use data of 25 medications or drug classes collected through manual abstraction and automated extraction from the EHR. Quantitatively, we assessed concordance using Cohen's kappa to measure interrater reliability, and qualitatively, we audited observed discrepancies to determine causes of inconsistencies.
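To illustrate the concordance statistic named above, here is a minimal sketch of Cohen's kappa for two sets of categorical ratings, such as a manual and an automated record of whether each patient received a given medication. The function name `cohens_kappa` and the toy ratings are hypothetical; they are not the study's code or data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of items on which both sources agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the two marginal rates, summed over categories.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    p_e = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if p_e == 1.0:
        # Both sources use one identical category throughout; kappa is
        # undefined here, and returning 1.0 is a convention assumed for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: 1 = medication recorded, 0 = not recorded.
manual = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
automated = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(manual, automated)
print(f"kappa = {kappa:.2f}")
```

Here the two sources agree on 8 of 10 patients (observed agreement 0.8) while chance agreement is 0.5, so kappa corrects the raw agreement downward to 0.6.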

RESULTS

For the 16 inpatient medications, 11 (69%) demonstrated moderate or better agreement; 7 of those demonstrated strong or almost perfect agreement. For 9 outpatient medications, 3 (33%) demonstrated moderate agreement, but none achieved strong or almost perfect agreement. We audited 12% of all discrepancies (716/5,790) and, in those audited, observed three principal categories of error: human error in manual abstraction (26%), errors in the extract-transform-load (ETL) or mapping of the automated extraction (41%), and abstraction-query mismatch (33%).

CONCLUSION

Our findings suggest many inpatient medications can be collected reliably through automated extraction, especially when abstraction instructions are designed with data architecture in mind. We discuss quality issues, concerns, and improvements for institutions to consider when crafting an approach. During crises, institutions must decide how to allocate limited resources. We show that automated extraction of medications is feasible and make recommendations on how to improve future iterations.

Affiliation(s)

Andrew L. Yin Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Department of Medicine, Weill Cornell Medicine, New York, NY, United States; Corresponding author at: 1300 York Avenue, New York, NY 10021, United States
Winston L. Guo Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States
Evan T. Sholle Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States
Mangala Rajan Department of Medicine, Weill Cornell Medicine, New York, NY, United States
Mark N. Alshak Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Department of Medicine, Weill Cornell Medicine, New York, NY, United States
Justin J. Choi Division of General Internal Medicine, Weill Cornell Medicine, New York, NY, United States
Parag Goyal Division of General Internal Medicine, Weill Cornell Medicine, New York, NY, United States
Assem Jabri Division of General Internal Medicine, Weill Cornell Medicine, New York, NY, United States
Han A. Li Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Department of Medicine, Weill Cornell Medicine, New York, NY, United States
Laura C. Pinheiro Department of Medicine, Weill Cornell Medicine, New York, NY, United States
Graham T. Wehmeyer Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Department of Medicine, Weill Cornell Medicine, New York, NY, United States
Mark Weiner Department of Medicine, Weill Cornell Medicine, New York, NY, United States; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States
Weill Cornell COVID-19 Data Abstraction Consortium Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States
Monika M. Safford Division of General Internal Medicine, Weill Cornell Medicine, New York, NY, United States
Thomas R. Campion Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States; Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY, United States
Curtis L. Cole Department of Medicine, Weill Cornell Medicine, New York, NY, United States; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States

Wang L, Foer D, MacPhaul E, Lo YC, Bates DW, Zhou L. PASCLex: A comprehensive post-acute sequelae of COVID-19 (PASC) symptom lexicon derived from electronic health record clinical notes. J Biomed Inform 2022;125:103951. [PMID: 34785382 PMCID: PMC8590503 DOI: 10.1016/j.jbi.2021.103951] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2021] [Revised: 11/06/2021] [Accepted: 11/06/2021] [Indexed: 01/04/2023]

Hripcsak G, Schuemie MJ, Madigan D, Ryan PB, Suchard MA. Drawing Reproducible Conclusions from Observational Clinical Data with OHDSI. Yearb Med Inform 2021;30:283-289. [PMID: 33882595 PMCID: PMC8416226 DOI: 10.1055/s-0041-1726481] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023] Open

Chen Q, Leaman R, Allot A, Luo L, Wei CH, Yan S, Lu Z. Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing. Annu Rev Biomed Data Sci 2021;4:313-339. [PMID: 34465169 DOI: 10.1146/annurev-biodatasci-021821-061045] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Guo Y, Zhang Y, Lyu T, Prosperi M, Wang F, Xu H, Bian J. The application of artificial intelligence and data integration in COVID-19 studies: a scoping review. J Am Med Inform Assoc 2021;28:2050-2067. [PMID: 34151987 PMCID: PMC8344463 DOI: 10.1093/jamia/ocab098] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2021] [Revised: 05/03/2021] [Accepted: 05/06/2021] [Indexed: 12/23/2022] Open

Fries JA, Steinberg E, Khattar S, Fleming SL, Posada J, Callahan A, Shah NH. Ontology-driven weak supervision for clinical entity classification in electronic health records. Nat Commun 2021;12:2017. [PMID: 33795682 PMCID: PMC8016863 DOI: 10.1038/s41467-021-22328-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2020] [Accepted: 02/26/2021] [Indexed: 02/07/2023] Open

Leslie D, Mazumder A, Peppin A, Wolters MK, Hagerty A. Does "AI" stand for augmenting inequality in the era of covid-19 healthcare? BMJ 2021;372:n304. [PMID: 33722847 PMCID: PMC7958301 DOI: 10.1136/bmj.n304] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]

Cossio M, Gilardino RE. Would the Use of Artificial Intelligence in COVID-19 Patient Management Add Value to the Healthcare System? Front Med (Lausanne) 2021;8:619202. [PMID: 33585525 PMCID: PMC7873524 DOI: 10.3389/fmed.2021.619202] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2020] [Accepted: 01/06/2021] [Indexed: 11/13/2022] Open