1
Patel D, Raut G, Zimlichman E, Cheetirala SN, Nadkarni GN, Glicksberg BS, Apakama DU, Bell EJ, Freeman R, Timsina P, Klang E. Evaluating prompt engineering on GPT-3.5's performance in USMLE-style medical calculations and clinical scenarios generated by GPT-4. Sci Rep 2024; 14:17341. [PMID: 39069520 DOI: 10.1038/s41598-024-66933-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 07/05/2024] [Indexed: 07/30/2024] Open
Abstract
This study was designed to assess how different prompt engineering techniques, specifically direct prompts, Chain of Thought (CoT), and a modified CoT approach, influence the ability of GPT-3.5 to answer clinical and calculation-based medical questions, particularly those styled like the USMLE Step 1 exams. To achieve this, we analyzed the responses of GPT-3.5 to two distinct sets of questions: a batch of 1000 questions generated by GPT-4, and another set comprising 95 real USMLE Step 1 questions. These questions spanned a range of medical calculations and clinical scenarios across various fields and difficulty levels. Our analysis revealed that there were no significant differences in the accuracy of GPT-3.5's responses when using direct prompts, CoT, or modified CoT methods. For instance, in the USMLE sample, the success rates were 61.7% for direct prompts, 62.8% for CoT, and 57.4% for modified CoT, with a p-value of 0.734. Similar trends were observed in the responses to GPT-4 generated questions, both clinical and calculation-based, with p-values above 0.05 indicating no significant difference between the prompt types. The conclusion drawn from this study is that the use of CoT prompt engineering does not significantly alter GPT-3.5's effectiveness in handling medical calculations or clinical scenario questions styled like those in USMLE exams. This finding is crucial as it suggests that performance of ChatGPT remains consistent regardless of whether a CoT technique is used instead of direct prompts. This consistency could be instrumental in simplifying the integration of AI tools like ChatGPT into medical education, enabling healthcare professionals to utilize these tools with ease, without the necessity for complex prompt engineering.
2
Huerta N, Rao SJ, Isath A, Wang Z, Glicksberg BS, Krittanawong C. The premise, promise, and perils of artificial intelligence in critical care cardiology. Prog Cardiovasc Dis 2024:S0033-0620(24)00094-X. [PMID: 38936757 DOI: 10.1016/j.pcad.2024.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Accepted: 06/23/2024] [Indexed: 06/29/2024]
Abstract
Artificial intelligence (AI) is an emerging technology with numerous healthcare applications. AI could prove particularly useful in the cardiac intensive care unit (CICU) where its capacity to analyze large datasets in real-time would assist clinicians in making more informed decisions. This systematic review aimed to explore current research on AI as it pertains to the CICU. A PRISMA search strategy was carried out to identify the pertinent literature on topics including vascular access, heart failure care, circulatory support, cardiogenic shock, ultrasound, and mechanical ventilation. Thirty-eight studies were included. Although AI is still in its early stages of development, this review illustrates its potential to yield numerous benefits in the CICU.
3
Gleason A, Richter F, Beller N, Arivazhagan N, Feng R, Holmes E, Glicksberg BS, Morton SU, La Vega-Talbott M, Fields M, Guttmann K, Nadkarni GN, Richter F. Accurate prediction of neurologic changes in critically ill infants using pose AI. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.17.24305953. [PMID: 38699362 PMCID: PMC11064996 DOI: 10.1101/2024.04.17.24305953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Infant alertness and neurologic changes can reflect life-threatening pathology but are assessed by exam, which can be intermittent and subjective. Reliable, continuous methods are needed. We hypothesized that our computer vision method to track movement, pose AI, could predict neurologic changes in the neonatal intensive care unit (NICU). We collected 4,705 hours of video linked to electroencephalograms (EEG) from 115 infants. We trained a deep learning pose algorithm that accurately predicted anatomic landmarks in three evaluation sets (ROC-AUCs 0.83-0.94), showing feasibility of applying pose AI in an ICU. We then trained classifiers on landmarks from pose AI and observed high performance for sedation (ROC-AUCs 0.87-0.91) and cerebral dysfunction (ROC-AUCs 0.76-0.91), demonstrating that an EEG diagnosis can be predicted from video data alone. Taken together, deep learning with pose AI may offer a scalable, minimally invasive method for neuro-telemetry in the NICU.
4
Rao SJ, Isath A, Krishnan P, Tangsrivimol JA, Virk HUH, Wang Z, Glicksberg BS, Krittanawong C. ChatGPT: A Conceptual Review of Applications and Utility in the Field of Medicine. J Med Syst 2024; 48:59. [PMID: 38836893 DOI: 10.1007/s10916-024-02075-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 05/07/2024] [Indexed: 06/06/2024]
Abstract
Artificial Intelligence, specifically advanced language models such as ChatGPT, have the potential to revolutionize various aspects of healthcare, medical education, and research. In this narrative review, we evaluate the myriad applications of ChatGPT in diverse healthcare domains. We discuss its potential role in clinical decision-making, exploring how it can assist physicians by providing rapid, data-driven insights for diagnosis and treatment. We review the benefits of ChatGPT in personalized patient care, particularly in geriatric care, medication management, weight loss and nutrition, and physical activity guidance. We further delve into its potential to enhance medical research, through the analysis of large datasets, and the development of novel methodologies. In the realm of medical education, we investigate the utility of ChatGPT as an information retrieval tool and personalized learning resource for medical students and professionals. There are numerous promising applications of ChatGPT that will likely induce paradigm shifts in healthcare practice, education, and research. The use of ChatGPT may come with several benefits in areas such as clinical decision making, geriatric care, medication management, weight loss and nutrition, physical fitness, scientific research, and medical education. Nevertheless, it is important to note that issues surrounding ethics, data privacy, transparency, inaccuracy, and inadequacy persist. Prior to widespread use in medicine, it is imperative to objectively evaluate the impact of ChatGPT in a real-world setting using a risk-based approach.
5
Glicksberg BS, Timsina P, Patel D, Sawant A, Vaid A, Raut G, Charney AW, Apakama D, Carr BG, Freeman R, Nadkarni GN, Klang E. Evaluating the accuracy of a state-of-the-art large language model for prediction of admissions from the emergency room. J Am Med Inform Assoc 2024:ocae103. [PMID: 38771093 DOI: 10.1093/jamia/ocae103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 04/22/2024] [Indexed: 05/22/2024] Open
Abstract
BACKGROUND Artificial intelligence (AI) and large language models (LLMs) can play a critical role in emergency room operations by augmenting decision-making about patient admission. However, there are no studies for LLMs using real-world data and scenarios, in comparison to and being informed by traditional supervised machine learning (ML) models. We evaluated the performance of GPT-4 for predicting patient admissions from emergency department (ED) visits. We compared performance to traditional ML models both naively and when informed by few-shot examples and/or numerical probabilities. METHODS We conducted a retrospective study using electronic health records across 7 NYC hospitals. We trained Bio-Clinical-BERT and XGBoost (XGB) models on unstructured and structured data, respectively, and created an ensemble model reflecting ML performance. We then assessed GPT-4 capabilities in many scenarios: through Zero-shot, Few-shot with and without retrieval-augmented generation (RAG), and with and without ML numerical probabilities. RESULTS The Ensemble ML model achieved an area under the receiver operating characteristic curve (AUC) of 0.88, an area under the precision-recall curve (AUPRC) of 0.72 and an accuracy of 82.9%. The naïve GPT-4's performance (0.79 AUC, 0.48 AUPRC, and 77.5% accuracy) showed substantial improvement when given limited, relevant data to learn from (ie, RAG) and underlying ML probabilities (0.87 AUC, 0.71 AUPRC, and 83.1% accuracy). Interestingly, RAG alone boosted performance to near peak levels (0.82 AUC, 0.56 AUPRC, and 81.3% accuracy). CONCLUSIONS The naïve LLM had limited performance but showed significant improvement in predicting ED admissions when supplemented with real-world examples to learn from, particularly through RAG, and/or numerical probabilities from traditional ML models. Its peak performance, although slightly lower than the pure ML model, is noteworthy given its potential for providing reasoning behind predictions. Further refinement of LLMs with real-world data is necessary for successful integration as decision-support tools in care settings.
6
El Sherbini A, Rosenson RS, Al Rifai M, Virk HUH, Wang Z, Virani S, Glicksberg BS, Lavie CJ, Krittanawong C. Artificial intelligence in preventive cardiology. Prog Cardiovasc Dis 2024; 84:76-89. [PMID: 38460897 DOI: 10.1016/j.pcad.2024.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Accepted: 03/03/2024] [Indexed: 03/11/2024]
Abstract
Artificial intelligence (AI) is a field of study that strives to replicate aspects of human intelligence into machines. Preventive cardiology, a subspeciality of cardiovascular (CV) medicine, aims to target and mitigate known risk factors for CV disease (CVD). AI's integration into preventive cardiology may introduce novel treatment interventions and AI-centered clinician assistive tools to reduce the risk of CVD. AI's role in nutrition, weight loss, physical activity, sleep hygiene, blood pressure, dyslipidemia, smoking, alcohol, recreational drugs, and mental health has been investigated. AI has immense potential to be used for the screening, detection, and monitoring of the mentioned risk factors. However, the current literature must be supplemented with future clinical trials to evaluate the capabilities of AI interventions for preventive cardiology. This review discusses present examples, potentials, and limitations of AI's role for the primary and secondary prevention of CVD.
7
Artsi Y, Sorin V, Konen E, Glicksberg BS, Nadkarni G, Klang E. Large language models for generating medical examinations: systematic review. BMC MEDICAL EDUCATION 2024; 24:354. [PMID: 38553693 PMCID: PMC10981304 DOI: 10.1186/s12909-024-05239-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 02/28/2024] [Indexed: 04/01/2024]
Abstract
BACKGROUND Writing multiple choice questions (MCQs) for the purpose of medical exams is challenging. It requires extensive medical knowledge, time and effort from medical educators. This systematic review focuses on the application of large language models (LLMs) in generating medical MCQs. METHODS The authors searched for studies published up to November 2023. Search terms focused on LLM-generated MCQs for medical examinations. Non-English studies, studies outside the year range, and studies not focusing on AI-generated multiple-choice questions were excluded. MEDLINE was used as a search database. Risk of bias was evaluated using a tailored QUADAS-2 tool. RESULTS Overall, eight studies published between April 2023 and October 2023 were included. Six studies used Chat-GPT 3.5, while two employed GPT 4. Five studies showed that LLMs can produce competent questions valid for medical exams. Three studies used LLMs to write medical questions but did not evaluate the validity of the questions. One study conducted a comparative analysis of different models. One other study compared LLM-generated questions with those written by humans. All studies presented faulty questions that were deemed inappropriate for medical exams. Some questions required additional modifications in order to qualify. CONCLUSIONS LLMs can be used to write MCQs for medical examinations. However, their limitations cannot be ignored. Further study in this field is essential and more conclusive evidence is needed. Until then, LLMs may serve as a supplementary tool for writing medical examinations. Two studies were at high risk of bias. The study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
8
Sorin V, Glicksberg BS, Artsi Y, Barash Y, Konen E, Nadkarni GN, Klang E. Utilizing large language models in breast cancer management: systematic review. J Cancer Res Clin Oncol 2024; 150:140. [PMID: 38504034 PMCID: PMC10950983 DOI: 10.1007/s00432-024-05678-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Accepted: 03/01/2024] [Indexed: 03/21/2024]
Abstract
PURPOSE Despite advanced technologies in breast cancer management, challenges remain in efficiently interpreting vast clinical data for patient-specific insights. We reviewed the literature on how large language models (LLMs) such as ChatGPT might offer solutions in this field. METHODS We searched MEDLINE for relevant studies published before December 22, 2023. Keywords included: "large language models", "LLM", "GPT", "ChatGPT", "OpenAI", and "breast". The risk of bias was evaluated using the QUADAS-2 tool. RESULTS Six studies evaluating either ChatGPT-3.5 or GPT-4 met our inclusion criteria. They explored clinical notes analysis, guideline-based question-answering, and patient management recommendations. Accuracy varied between studies, ranging from 50 to 98%. Higher accuracy was seen in structured tasks like information retrieval. Half of the studies used real patient data, adding practical clinical value. Challenges included inconsistent accuracy, dependency on the way questions are posed (prompt-dependency), and in some cases, missing critical clinical information. CONCLUSION LLMs hold potential in breast cancer care, especially in textual information extraction and guideline-driven clinical question-answering. Yet, their inconsistent accuracy underscores the need for careful validation of these models, and the importance of ongoing supervision.
9
Oh W, Jayaraman P, Tandon P, Chaddha US, Kovatch P, Charney AW, Glicksberg BS, Nadkarni GN. A novel method leveraging time series data to improve subphenotyping and application in critically ill patients with COVID-19. Artif Intell Med 2024; 148:102750. [PMID: 38325922 PMCID: PMC10864255 DOI: 10.1016/j.artmed.2023.102750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 12/12/2023] [Accepted: 12/14/2023] [Indexed: 02/09/2024]
Abstract
Computational subphenotyping, a data-driven approach to understanding disease subtypes, is a prominent topic in medical research. Numerous ongoing studies are dedicated to developing advanced computational subphenotyping methods for cross-sectional data. However, the potential of time-series data has been underexplored until now. Here, we propose a Multivariate Levenshtein Distance (MLD) that can account for correlation in multiple discrete features over time-series data. Our algorithm has two distinct components: it integrates an optimal threshold score to enhance the sensitivity in discriminating between pairs of instances, and the MLD itself. We applied the proposed distance metric to the k-means clustering algorithm to derive temporal subphenotypes from time-series data of biomarkers and treatment administrations from 1039 critically ill patients with COVID-19 and compared its effectiveness to standard methods. In conclusion, the Multivariate Levenshtein Distance metric is a novel method to quantify the distance from multiple discrete features over time-series data and demonstrates superior clustering performance among competing time-series distance metrics.
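As a rough illustration of the idea of an edit distance over several discrete features at once, the sketch below runs a standard Levenshtein recurrence over sequences whose elements are tuples of feature values. It is not the authors' MLD: the substitution cost (the fraction of features that differ at a time step) is an assumption made here for illustration, and the optimal threshold score mentioned in the abstract is not modelled.

```python
from typing import Sequence, Tuple

State = Tuple[str, ...]  # one time step: a tuple of discrete feature values


def substitution_cost(a: State, b: State) -> float:
    """Assumed cost: fraction of features that differ between two time steps."""
    if len(a) != len(b):
        raise ValueError("time steps must have the same number of features")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def multivariate_edit_distance(s: Sequence[State], t: Sequence[State]) -> float:
    """Levenshtein recurrence with unit insert/delete cost and a
    fractional substitution cost between multivariate time steps."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [float(i)]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1.0,                          # delete a from s
                curr[j - 1] + 1.0,                      # insert b into s
                prev[j - 1] + substitution_cost(a, b),  # substitute a with b
            ))
        prev = curr
    return prev[-1]
```

A matrix of such pairwise distances could then be passed to a clustering routine; how the original work couples the metric with k-means is not described in the abstract.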
10
Zang C, Zhang H, Xu J, Zhang H, Fouladvand S, Havaldar S, Cheng F, Chen K, Chen Y, Glicksberg BS, Chen J, Bian J, Wang F. High-throughput target trial emulation for Alzheimer's disease drug repurposing with real-world data. Nat Commun 2023; 14:8180. [PMID: 38081829 PMCID: PMC10713627 DOI: 10.1038/s41467-023-43929-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/13/2022] [Accepted: 11/24/2023] [Indexed: 12/18/2023] Open
Abstract
Target trial emulation is the process of mimicking target randomized trials using real-world data, where effective confounding control for unbiased treatment effect estimation remains a main challenge. Although various approaches have been proposed for this challenge, a systematic evaluation is still lacking. Here we emulated trials for thousands of medications from two large-scale real-world data warehouses, covering over 10 years of clinical records for over 170 million patients, aiming to identify new indications of approved drugs for Alzheimer's disease. We assessed different propensity score models under the inverse probability of treatment weighting framework and suggested a model selection strategy for improved baseline covariate balancing. We also found that the deep learning-based propensity score model did not necessarily outperform logistic regression-based methods in covariate balancing. Finally, we highlighted five top-ranked drugs (pantoprazole, gabapentin, atorvastatin, fluticasone, and omeprazole) originally intended for other indications with potential benefits for Alzheimer's patients.
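The weighting step referred to above can be sketched in a few lines. This is a generic, minimal illustration of inverse probability of treatment weighting and of a weighted standardized mean difference as a balance check, assuming propensity scores have already been estimated by some model; the function names are hypothetical and this is not the study's pipeline or its model selection strategy.

```python
import math
from typing import List, Sequence


def iptw_weights(treated: Sequence[int], propensity: Sequence[float]) -> List[float]:
    """Weight 1/e for treated units and 1/(1 - e) for untreated units."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t else 1.0 / (1.0 - e))
    return weights


def _weighted_mean_var(values: Sequence[float], weights: Sequence[float]):
    total = sum(weights)
    if total == 0:
        raise ValueError("group has no weight")
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def weighted_smd(covariate: Sequence[float], treated: Sequence[int],
                 weights: Sequence[float]) -> float:
    """Standardized mean difference of one covariate after weighting."""
    xt = [x for x, t in zip(covariate, treated) if t]
    wt = [w for w, t in zip(weights, treated) if t]
    xc = [x for x, t in zip(covariate, treated) if not t]
    wc = [w for w, t in zip(weights, treated) if not t]
    mean_t, var_t = _weighted_mean_var(xt, wt)
    mean_c, var_c = _weighted_mean_var(xc, wc)
    pooled_sd = math.sqrt((var_t + var_c) / 2.0)
    return 0.0 if pooled_sd == 0 else (mean_t - mean_c) / pooled_sd
```

A weighted standardized mean difference near zero for a baseline covariate is a common sign that the weighting has balanced it between the treated and untreated groups.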
11
Tessler I, Gecel NA, Glicksberg BS, Shivatzki S, Shapira Y, Zimlichman E, Alon EE, Klang E, Wolfovitz A. A Five-Decade Text Mining Analysis of Cochlear Implant Research: Where We Started and Where We Are Heading. MEDICINA (KAUNAS, LITHUANIA) 2023; 59:1891. [PMID: 38003940 PMCID: PMC10673015 DOI: 10.3390/medicina59111891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 10/09/2023] [Accepted: 10/19/2023] [Indexed: 11/26/2023]
Abstract
Background and Objectives: Since its invention in the 1970s, the cochlear implant (CI) has been substantially developed. We aimed to assess the trends in the published literature to characterize CI. Materials and Methods: We queried PubMed for all CI-related entries published during 1970-2022. The following data were extracted: year of publication, publishing journal, title, keywords, and abstract text. Search terms belonged to the patient's age group, etiology for hearing loss, indications for CI, and surgical methodological advancement. Annual trends of publications were plotted. The slopes of publication trends were calculated by fitting regression lines to the yearly number of publications. Results: Overall, 19,428 CI articles were identified. Pediatric-related CI was the most dominant sub-population among the age groups, with the highest rate and slope during the years (slope 5.2 ± 0.3, p < 0.001), while elderly-related CI had significantly fewer publications. Entries concerning hearing preservation showed the sharpest rise among the methods, from no entries in 1980 to 46 entries in 2021 (slope 1.7 ± 0.2, p < 0.001). Entries concerning robotic surgery emerged in 2000, with a sharp increase in recent years (slope 0.5 ± 0.1, p < 0.001). Drug-eluting electrodes and CI under local anesthesia have been reported only in the past five years, with a gradual rise. Conclusions: Publications regarding CI among pediatrics outnumbered all other indications, supporting the rising, pivotal role of CI in the rehabilitation of children with sensorineural hearing loss. Hearing-preservation publications have recently rapidly risen, identified as the primary trend of the current era, followed by a sharp rise of robotic surgery that is evolving and could define the next revolution.
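The slope calculation described in the methods can be sketched as an ordinary least-squares fit of yearly publication counts against year. The function name and the counts in the usage note are hypothetical placeholders, not data from the study, and the sketch does not compute the uncertainty that accompanies the reported slopes.

```python
from typing import Sequence, Tuple


def trend_slope(years: Sequence[float], counts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of counts regressed on years."""
    if len(years) != len(counts) or len(years) < 2:
        raise ValueError("need at least two (year, count) pairs of equal length")
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx                      # publications gained per year
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For example, `trend_slope([2018, 2019, 2020, 2021], [10, 12, 14, 16])` gives a slope of 2 additional publications per year for these made-up counts.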
12
Brin D, Sorin V, Vaid A, Soroush A, Glicksberg BS, Charney AW, Nadkarni G, Klang E. Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments. Sci Rep 2023; 13:16492. [PMID: 37779171 PMCID: PMC10543445 DOI: 10.1038/s41598-023-43436-9] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 07/26/2023] [Accepted: 09/23/2023] [Indexed: 10/03/2023] Open
Abstract
The United States Medical Licensing Examination (USMLE) has been a subject of performance study for artificial intelligence (AI) models. However, their performance on questions involving USMLE soft skills remains unexplored. This study aimed to evaluate ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style questions involving soft skills, taken from the USMLE website and the AMBOSS question bank. A follow-up query was used to assess the models' consistency. The performance of the AI models was compared to that of previous AMBOSS users. GPT-4 outperformed ChatGPT, correctly answering 90% compared to ChatGPT's 62.5%. GPT-4 showed more confidence, not revising any responses, while ChatGPT modified its original answers 82.5% of the time. The performance of GPT-4 was higher than that of AMBOSS's past users. Both AI models, notably GPT-4, showed capacity for empathy, indicating AI's potential to meet the complex interpersonal, ethical, and professional demands intrinsic to the practice of medicine.
13
Sorin V, Soffer S, Glicksberg BS, Barash Y, Konen E, Klang E. Adversarial attacks in radiology - A systematic review. Eur J Radiol 2023; 167:111085. [PMID: 37699278 DOI: 10.1016/j.ejrad.2023.111085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 08/04/2023] [Accepted: 09/04/2023] [Indexed: 09/14/2023]
Abstract
PURPOSE The growing application of deep learning in radiology has raised concerns about cybersecurity, particularly in relation to adversarial attacks. This study aims to systematically review the literature on adversarial attacks in radiology. METHODS We searched for studies on adversarial attacks in radiology published up to April 2023, using MEDLINE and Google Scholar databases. RESULTS A total of 22 studies published between March 2018 and April 2023 were included, primarily focused on image classification algorithms. Fourteen studies evaluated white-box attacks, three assessed black-box attacks and five investigated both. Eleven of the 22 studies targeted chest X-ray classification algorithms, while others involved chest CT (6/22), brain MRI (4/22), mammography (2/22), abdominal CT (1/22), hepatic US (1/22), and thyroid US (1/22). Some attacks proved highly effective, reducing the AUC of algorithm performance to 0 and achieving success rates up to 100 %. CONCLUSIONS Adversarial attacks are a growing concern. Although currently the threats are more theoretical than practical, they still represent a potential risk. It is important to be alert to such attacks, reinforce cybersecurity measures, and influence the formulation of ethical and legal guidelines. This will ensure the safe use of deep learning technology in medicine.
14
Yeung AWK, Torkamani A, Butte AJ, Glicksberg BS, Schuller B, Rodriguez B, Ting DSW, Bates D, Schaden E, Peng H, Willschke H, van der Laak J, Car J, Rahimi K, Celi LA, Banach M, Kletecka-Pulker M, Kimberger O, Eils R, Islam SMS, Wong ST, Wong TY, Gao W, Brunak S, Atanasov AG. The promise of digital healthcare technologies. Front Public Health 2023; 11:1196596. [PMID: 37822534 PMCID: PMC10562722 DOI: 10.3389/fpubh.2023.1196596] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/01/2023] [Accepted: 09/04/2023] [Indexed: 10/13/2023] Open
Abstract
Digital health technologies have been in use for many years in a wide spectrum of healthcare scenarios. This narrative review outlines the current use and the future strategies and significance of digital health technologies in modern healthcare applications. It covers the current state of the scientific field (delineating major strengths, limitations, and applications) and envisions the future impact of relevant emerging key technologies. Furthermore, we attempt to provide recommendations for innovative approaches that would accelerate and benefit the research, translation and utilization of digital health technologies.
15
He Z, Zhang R, Diallo G, Huang Z, Glicksberg BS. Editorial: Explainable artificial intelligence for critical healthcare applications. Front Artif Intell 2023; 6:1282800. [PMID: 37771610 PMCID: PMC10523392 DOI: 10.3389/frai.2023.1282800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 08/28/2023] [Indexed: 09/30/2023] Open
16
Paranjpe I, Wang X, Anandakrishnan N, Haydak JC, Van Vleck T, DeFronzo S, Li Z, Mendoza A, Liu R, Fu J, Forrest I, Zhou W, Lee K, O'Hagan R, Dellepiane S, Menon KM, Gulamali F, Kamat S, Gusella GL, Charney AW, Hofer I, Cho JH, Do R, Glicksberg BS, He JC, Nadkarni GN, Azeloglu EU. Deep learning on electronic medical records identifies distinct subphenotypes of diabetic kidney disease driven by genetic variations in the Rho pathway. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.09.06.23295120. [PMID: 37732187 PMCID: PMC10508814 DOI: 10.1101/2023.09.06.23295120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
Kidney disease affects 50% of all diabetic patients; however, prediction of disease progression has been challenging due to inherent disease heterogeneity. We use deep learning to identify novel genetic signatures prognostically associated with outcomes. Using autoencoders and unsupervised clustering of electronic health record data on 1,372 diabetic kidney disease patients, we establish two clusters with differential prevalence of end-stage kidney disease. Exome-wide associations identify a novel variant in ARHGEF18, a Rho guanine exchange factor specifically expressed in glomeruli. Overexpression of ARHGEF18 in human podocytes leads to impairments in focal adhesion architecture, cytoskeletal dynamics, cellular motility, and RhoA/Rac1 activation. Mutant GEF18 is resistant to ubiquitin mediated degradation leading to pathologically increased protein levels. Our findings uncover the first known disease-causing genetic variant that affects protein stability of a cytoskeletal regulator through impaired degradation, a potentially novel class of expression quantitative trait loci that can be therapeutically targeted.
17
Perez Garcia G, Bicak M, Buros J, Haure-Mirande JV, Perez GM, Otero-Pagan A, Gama Sosa MA, De Gasperi R, Sano M, Gage FH, Barlow C, Dudley JT, Glicksberg BS, Wang Y, Readhead B, Ehrlich ME, Elder GA, Gandy S. Beneficial effects of physical exercise and an orally active mGluR2/3 antagonist pro-drug on neurogenesis and behavior in an Alzheimer's amyloidosis model. FRONTIERS IN DEMENTIA 2023; 2:1198006. [PMID: 39081972 PMCID: PMC11285632 DOI: 10.3389/frdem.2023.1198006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 07/31/2023] [Indexed: 08/02/2024]
Abstract
Background Modulation of physical activity represents an important intervention that may delay, slow, or prevent mild cognitive impairment (MCI) or dementia due to Alzheimer's disease (AD). One mechanism proposed to underlie the beneficial effect of physical exercise (PE) involves the apparent stimulation of adult hippocampal neurogenesis (AHN). BCI-838 is a pro-drug whose active metabolite BCI-632 is a negative allosteric modulator at group II metabotropic glutamate receptors (mGluR2/3). We previously demonstrated that administration of BCI-838 to a mouse model of brain accumulation of oligomeric AβE22Q (APP E693Q = "Dutch APP") reduced learning behavior impairment and anxiety, both of which are associated with the phenotype of Dutch APP mice. Methods 3-month-old mice were administered BCI-838 and/or physical exercise for 1 month and then tested in novel object recognition, neurogenesis, and RNAseq. Results Here we show that (i) administration of BCI-838 and a combination of BCI-838 and PE enhanced AHN in a 4-month old mouse model of AD amyloid pathology (APP KM670/671NL /PSEN1 Δexon9= APP/PS1), (ii) administration of BCI-838 alone or with PE led to stimulation of AHN and improvement in recognition memory, (iii) the hippocampal dentate gyrus transcriptome of APP/PS1 mice following BCI-838 treatment showed up-regulation of brain-derived neurotrophic factor (BDNF), PIK3C2A of the PI3K-mTOR pathway, and metabotropic glutamate receptors, and down-regulation of EIF5A involved in modulation of mTOR activity by ketamine, and (iv) validation by qPCR of an association between increased BDNF levels and BCI-838 treatment. Conclusion Our study points to BCI-838 as a safe and orally active compound capable of mimicking the beneficial effect of PE on AHN and recognition memory in a mouse model of AD amyloid pathology.
18
Tangsrivimol JA, Schonfeld E, Zhang M, Veeravagu A, Smith TR, Härtl R, Lawton MT, El-Sherbini AH, Prevedello DM, Glicksberg BS, Krittanawong C. Artificial Intelligence in Neurosurgery: A State-of-the-Art Review from Past to Future. Diagnostics (Basel) 2023; 13:2429. [PMID: 37510174 PMCID: PMC10378231 DOI: 10.3390/diagnostics13142429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 07/06/2023] [Accepted: 07/10/2023] [Indexed: 07/30/2023] Open
Abstract
In recent years, there has been a significant surge in discussions surrounding artificial intelligence (AI), along with a corresponding increase in its practical applications in various facets of everyday life, including the medical industry. Notably, even in the highly specialized realm of neurosurgery, AI has been utilized for differential diagnosis, pre-operative evaluation, and improving surgical precision. Many of these applications have begun to mitigate the risks of intraoperative and postoperative complications and to support post-operative care. This article aims to present an overview of the principal published papers on the significant themes of tumor, spine, epilepsy, and vascular issues, wherein AI has been applied, to assess its potential applications within neurosurgery. The method involved identifying highly cited seminal papers using PubMed and Google Scholar, conducting a comprehensive review of various study types, and summarizing machine learning applications to enhance understanding among clinicians for future utilization. Recent studies demonstrate that machine learning (ML) holds significant potential in neuro-oncological care, spine surgery, epilepsy management, and other neurosurgical applications. ML techniques have proven effective in tumor identification, surgical outcomes prediction, seizure outcome prediction, aneurysm prediction, and more, highlighting their broad impact and potential in improving patient management and outcomes in neurosurgery. This review will encompass the current state of research, as well as predictions for the future of AI within neurosurgery.
|
19
|
Paranjpe I, Jayaraman P, Su CY, Zhou S, Chen S, Thompson R, Del Valle DM, Kenigsberg E, Zhao S, Jaladanki S, Chaudhary K, Ascolillo S, Vaid A, Gonzalez-Kozlova E, Kauffman J, Kumar A, Paranjpe M, Hagan RO, Kamat S, Gulamali FF, Xie H, Harris J, Patel M, Argueta K, Batchelor C, Nie K, Dellepiane S, Scott L, Levin MA, He JC, Suarez-Farinas M, Coca SG, Chan L, Azeloglu EU, Schadt E, Beckmann N, Gnjatic S, Merad M, Kim-Schulze S, Richards B, Glicksberg BS, Charney AW, Nadkarni GN. Proteomic characterization of acute kidney injury in patients hospitalized with SARS-CoV2 infection. COMMUNICATIONS MEDICINE 2023; 3:81. [PMID: 37308534 PMCID: PMC10258469 DOI: 10.1038/s43856-023-00307-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 05/18/2023] [Indexed: 06/14/2023] Open
Abstract
BACKGROUND Acute kidney injury (AKI) is a known complication of COVID-19 and is associated with an increased risk of in-hospital mortality. Unbiased proteomics using biological specimens can lead to improved risk stratification and the discovery of pathophysiological mechanisms. METHODS Using measurements of ~4000 plasma proteins in two cohorts of patients hospitalized with COVID-19, we discovered and validated markers of COVID-associated AKI (stage 2 or 3) and long-term kidney dysfunction. In the discovery cohort (N = 437), we identified 413 protein targets with higher plasma abundance and 30 protein targets with lower plasma abundance associated with COVID-AKI (adjusted p < 0.05). Of these, 62 proteins were validated in an external cohort (p < 0.05, N = 261). RESULTS We demonstrate that COVID-AKI is associated with increased markers of tubular injury (NGAL) and myocardial injury. Using estimated glomerular filtration rate (eGFR) measurements taken after discharge, we also find that 25 of the 62 AKI-associated proteins are significantly associated with decreased post-discharge eGFR (adjusted p < 0.05). Proteins most strongly associated with decreased post-discharge eGFR included desmocollin-2, trefoil factor 3, transmembrane emp24 domain-containing protein 10, and cystatin-C, indicating tubular dysfunction and injury. CONCLUSIONS Using clinical and proteomic data, our results suggest that while both acute and long-term COVID-associated kidney dysfunction are associated with markers of tubular dysfunction, AKI is driven by a largely multifactorial process involving hemodynamic instability and myocardial damage.
|
20
|
Menon KM, Das S, Shervey M, Johnson M, Glicksberg BS, Levin MA. Automated electrocardiogram signal quality assessment based on Fourier analysis and template matching. J Clin Monit Comput 2023; 37:829-837. [PMID: 36464761 PMCID: PMC9734499 DOI: 10.1007/s10877-022-00948-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Accepted: 11/10/2022] [Indexed: 12/12/2022]
Abstract
We developed and tested a novel template matching approach for signal quality assessment on electrocardiogram (ECG) data. A computational method was developed that uses a sinusoidal approximation to the QRS complex to generate a correlation value at every point of an ECG. The strength of this correlation can be numerically adapted into a 'score' for each segment of an ECG, which can be used to stratify signal quality. The algorithm was tested on lead II ECGs of intensive care unit (ICU) patients admitted to the Mount Sinai Hospital (MSH) from January to July 2020 and on records from the MIT-BIH arrhythmia database. The algorithm was found to be 98.9% specific and 99% sensitive on test data from the MSH ICU patients. The routine runs in linear O(n) time and occupies O(1) heap space at runtime. This approach can be used to lower the burden of pre-processing in ECG signal analysis. Given its runtime (O(n)) and memory (O(1)) complexity, there are potential applications for signal quality stratification and arrhythmia detection in wearable devices or smartphones.
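The template-matching idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the template shape (one full sine period), the window width, and the use of the maximum Pearson correlation as the segment score are all assumptions made for illustration, and the function names are hypothetical. The sketch also stores the full correlation trace, so it does not reproduce the O(1) memory property reported in the abstract.

```python
import math

def sine_template(width):
    """One sine period sampled at `width` points (assumed stand-in for a QRS shape)."""
    return [math.sin(2 * math.pi * i / width) for i in range(width)]

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def correlation_trace(signal, template):
    """Correlation between the template and every full window of the signal."""
    m = len(template)
    return [pearson(signal[i:i + m], template) for i in range(len(signal) - m + 1)]

def segment_score(signal, template):
    """Assumed scoring rule: the best template correlation found in the segment."""
    trace = correlation_trace(signal, template)
    return max(trace) if trace else 0.0

# Usage: a segment containing a template-shaped deflection versus a flat line.
template = sine_template(16)
clean = [0.0] * 20 + template + [0.0] * 20
flat = [0.0] * 56
clean_score = segment_score(clean, template)
flat_score = segment_score(flat, template)
```

For a fixed template width the trace costs a constant amount of work per sample, which is where the linear-time behaviour comes from; thresholds for mapping scores to quality classes would have to be chosen separately.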
|
21
|
Krittanawong C, Omar AMS, Narula S, Sengupta PP, Glicksberg BS, Narula J, Argulian E. Deep Learning for Echocardiography: Introduction for Clinicians and Future Vision: State-of-the-Art Review. Life (Basel) 2023; 13:life13041029. [PMID: 37109558 PMCID: PMC10145844 DOI: 10.3390/life13041029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 03/30/2023] [Accepted: 04/03/2023] [Indexed: 04/29/2023] Open
Abstract
Exponential growth in data storage and computational power is rapidly narrowing the gap between findings from advanced clinical informatics and their translation into cardiovascular clinical practice. Specifically, cardiovascular imaging has the distinct advantage of providing a great quantity of data for potentially rich insights, but nuanced interpretation requires a high-level skillset that few individuals possess. A subset of machine learning, deep learning (DL), is a modality that has shown promise, particularly in the areas of image recognition, computer vision, and video classification. Due to a low signal-to-noise ratio, echocardiographic data tend to be challenging to classify; however, utilization of robust DL architectures may help clinicians and researchers automate conventional human tasks and catalyze the extraction of clinically useful data from the petabytes of collected imaging data. The promise extends further, towards a contactless echocardiographic exam, a dream that is much needed in this time of uncertainty and social distancing brought on by a stunning pandemic culture. In the current review, we discuss state-of-the-art DL techniques and architectures that can be used for image and video classification, and future directions in echocardiographic research in the current era.
|
22
|
Ryu E, Jenkins GD, Wang Y, Olfson M, Talati A, Lepow L, Coombes BJ, Charney AW, Glicksberg BS, Mann JJ, Weissman MM, Wickramaratne P, Pathak J, Biernacka JM. The importance of social activity to risk of major depression in older adults. Psychol Med 2023; 53:2634-2642. [PMID: 34763736 PMCID: PMC9095757 DOI: 10.1017/s0033291721004566] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 06/14/2021] [Revised: 10/04/2021] [Accepted: 10/20/2021] [Indexed: 11/07/2022]
Abstract
BACKGROUND Several social determinants of health (SDoH) have been associated with the onset of major depressive disorder (MDD). However, prior studies largely focused on individual SDoH and thus less is known about the relative importance (RI) of SDoH variables, especially in older adults. Given that risk factors for MDD may differ across the lifespan, we aimed to identify the SDoH that was most strongly related to newly diagnosed MDD in a cohort of older adults. METHODS We used self-reported health-related survey data from 41 174 older adults (50-89 years, median age = 67 years) who participated in the Mayo Clinic Biobank, and linked ICD codes for MDD in the participants' electronic health records. Participants with a history of clinically documented or self-reported MDD prior to survey completion were excluded from analysis (N = 10 938, 27%). We used Cox proportional hazards models with a gradient boosting machine approach to quantify the RI of 30 pre-selected SDoH variables on the risk of future MDD diagnosis. RESULTS Following biobank enrollment, 2073 older participants were diagnosed with MDD during the follow-up period (median duration = 6.7 years). The most influential SDoH was perceived level of social activity (RI = 0.17). Lower level of social activity was associated with a higher risk of MDD [hazard ratio = 2.27 (95% CI 2.00-2.50) for highest v. lowest level]. CONCLUSION Across a range of SDoH variables, perceived level of social activity is most strongly related to MDD in older adults. Monitoring changes in the level of social activity may help identify older adults at an increased risk of MDD.
|
23
|
Sanders LM, Scott RT, Yang JH, Qutub AA, Garcia Martin H, Berrios DC, Hastings JJA, Rask J, Mackintosh G, Hoarfrost AL, Chalk S, Kalantari J, Khezeli K, Antonsen EL, Babdor J, Barker R, Baranzini SE, Beheshti A, Delgado-Aparicio GM, Glicksberg BS, Greene CS, Haendel M, Hamid AA, Heller P, Jamieson D, Jarvis KJ, Komarova SV, Komorowski M, Kothiyal P, Mahabal A, Manor U, Mason CE, Matar M, Mias GI, Miller J, Myers JG, Nelson C, Oribello J, Park SM, Parsons-Wingerter P, Prabhu RK, Reynolds RJ, Saravia-Butler A, Saria S, Sawyer A, Singh NK, Snyder M, Soboczenski F, Soman K, Theriot CA, Van Valen D, Venkateswaran K, Warren L, Worthey L, Zitnik M, Costes SV. Biological research and self-driving labs in deep space supported by artificial intelligence. NAT MACH INTELL 2023. [DOI: 10.1038/s42256-023-00618-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2023]
|
24
|
Scott RT, Sanders LM, Antonsen EL, Hastings JJA, Park SM, Mackintosh G, Reynolds RJ, Hoarfrost AL, Sawyer A, Greene CS, Glicksberg BS, Theriot CA, Berrios DC, Miller J, Babdor J, Barker R, Baranzini SE, Beheshti A, Chalk S, Delgado-Aparicio GM, Haendel M, Hamid AA, Heller P, Jamieson D, Jarvis KJ, Kalantari J, Khezeli K, Komarova SV, Komorowski M, Kothiyal P, Mahabal A, Manor U, Garcia Martin H, Mason CE, Matar M, Mias GI, Myers JG, Nelson C, Oribello J, Parsons-Wingerter P, Prabhu RK, Qutub AA, Rask J, Saravia-Butler A, Saria S, Singh NK, Snyder M, Soboczenski F, Soman K, Van Valen D, Venkateswaran K, Warren L, Worthey L, Yang JH, Zitnik M, Costes SV. Biomonitoring and precision health in deep space supported by artificial intelligence. NAT MACH INTELL 2023. [DOI: 10.1038/s42256-023-00617-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/28/2023]
|
25
|
Vaid A, Argulian E, Lerakis S, Beaulieu-Jones BK, Krittanawong C, Klang E, Lampert J, Reddy VY, Narula J, Nadkarni GN, Glicksberg BS. Multi-center retrospective cohort study applying deep learning to electrocardiograms to identify left heart valvular dysfunction. COMMUNICATIONS MEDICINE 2023; 3:24. [PMID: 36788316 PMCID: PMC9929085 DOI: 10.1038/s43856-023-00240-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/05/2022] [Accepted: 01/09/2023] [Indexed: 02/16/2023] Open
Abstract
BACKGROUND Aortic Stenosis and Mitral Regurgitation are common valvular conditions representing a hidden burden of disease within the population. The aim of this study was to develop and validate deep learning-based screening and diagnostic tools that can help guide clinical decision making. METHODS In this multi-center retrospective cohort study, we acquired Transthoracic Echocardiogram reports from five Mount Sinai hospitals within New York City representing a demographically diverse cohort of patients. We developed a Natural Language Processing pipeline to extract ground-truth labels about valvular status and paired these to Electrocardiograms (ECGs). We developed and externally validated deep learning models capable of detecting valvular disease, in addition to considering scenarios of clinical deployment. RESULTS We use 617,338 ECGs paired to transthoracic echocardiograms from 123,096 patients to develop a deep learning model for detection of Mitral Regurgitation. Area Under Receiver Operating Characteristic curve (AUROC) is 0.88 (95% CI:0.88-0.89) in internal testing, and 0.81 (95% CI:0.80-0.82) in external validation. To develop a model for detection of Aortic Stenosis, we use 617,338 Echo-ECG pairs for 128,628 patients. AUROC is 0.89 (95% CI: 0.88-0.89) in internal testing, going to 0.86 (95% CI: 0.85-0.87) in external validation. The model's performance increases leading up to the time of the diagnostic echo, and it performs well in validation against requirement of Transcatheter Aortic Valve Replacement procedures. CONCLUSIONS Deep learning based tools can increase the amount of information extracted from ubiquitous investigations such as the ECG. Such tools are inexpensive, can help in earlier disease detection, and potentially improve prognosis.
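For readers unfamiliar with the AUROC figures quoted in this abstract, the metric can be sketched as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as half. The function below is an illustrative, quadratic-time version of that definition; the function name and the toy labels and scores are assumptions and are unrelated to the study's data or code.

```python
def auroc(labels, scores):
    """Rank-based AUROC: P(score_pos > score_neg) + 0.5 * P(score_pos == score_neg)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))

# Usage with made-up labels (1 = disease present) and model scores.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported values between 0.81 and 0.89.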
|