1
Thapa R, Iqbal Z, Garikipati A, Siefkas A, Hoffman J, Mao Q, Das R. Early prediction of severe acute pancreatitis using machine learning. Pancreatology 2022; 22:43-50. [PMID: 34690046 DOI: 10.1016/j.pan.2021.10.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/13/2021] [Revised: 09/27/2021] [Accepted: 10/12/2021] [Indexed: 12/11/2022]
Abstract
BACKGROUND Acute pancreatitis (AP) is one of the most common causes of gastrointestinal-related hospitalizations in the United States. Severe AP (SAP) is associated with a mortality rate of nearly 30% and is distinguished from milder forms of AP. Risk stratification to identify SAP cases needing inpatient treatment is an important aspect of AP diagnosis. METHODS We developed machine learning algorithms to predict which patients presenting with AP would require treatment for SAP. Three models were developed using logistic regression, neural networks, and XGBoost. Models were assessed in terms of area under the receiver operating characteristic curve (AUROC) and compared to the Harmless Acute Pancreatitis Score (HAPS) and Bedside Index for Severity in Acute Pancreatitis (BISAP) scores for AP risk stratification. RESULTS Data from 61,894 patients were used to train and test the machine learning models. With an AUROC value of 0.921, the model developed using XGBoost outperformed the logistic regression and neural network-based models. The XGBoost model also achieved a higher AUROC than both HAPS and BISAP for identifying patients who would be diagnosed with SAP. CONCLUSIONS Machine learning may be able to improve the accuracy of AP risk stratification methods and allow for more timely treatment and initiation of interventions.
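AUROC is the comparison metric used throughout this listing, so a minimal sketch of what it measures may help. The function below is an illustration on toy data, not the authors' evaluation code: it computes AUROC as the fraction of positive/negative pairs in which the positive case receives the higher score, counting ties as half.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data only; these are not values from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(auroc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of cases; for cohorts of the size reported here, a rank-based implementation from an established library would be the practical choice.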
2
Rahmani K, Garikipati A, Barnes G, Hoffman J, Calvert J, Mao Q, Das R. Early prediction of central line associated bloodstream infection using machine learning. Am J Infect Control 2022; 50:440-445. [PMID: 34428529 DOI: 10.1016/j.ajic.2021.08.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/11/2021] [Revised: 08/16/2021] [Accepted: 08/17/2021] [Indexed: 11/01/2022]
Abstract
BACKGROUND Central line-associated bloodstream infections (CLABSIs) are associated with significant morbidity, mortality, and increased healthcare costs. Despite the high prevalence of CLABSIs in the U.S., there are currently no tools to stratify a patient's risk of developing an infection as the result of central line placement. To this end, we have developed and validated a machine learning algorithm (MLA) that can predict a patient's likelihood of developing CLABSI using only electronic health record data in order to provide clinical decision support. METHODS We created three machine learning models to retrospectively analyze electronic health record data from 27,619 patient encounters. The models were trained and validated using an 80:20 split for the train and test data. Patients designated as having a central line procedure based on International Statistical Classification of Diseases and Related Health Problems 10 codes were included. RESULTS XGBoost was the highest performing MLA out of the three models, obtaining an AUROC of 0.762 for CLABSI risk prediction at 48 hours after the recorded time for central line placement. CONCLUSIONS Our results demonstrate that MLAs may be effective clinical decision support tools for assessment of CLABSI risk and should be explored further for this purpose.
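The 80:20 train/test split described above can be sketched as follows. The abstract does not say whether the split was random, stratified, or grouped by patient, so the seeded random shuffle here is an assumption, and `train_test_split_80_20` and `encounter_ids` are hypothetical names used only for illustration.

```python
import random


def train_test_split_80_20(records, seed=0):
    """Shuffle a copy of the records and cut it 80% train / 20% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Stand-in identifiers for the 27,619 encounters mentioned in the abstract.
encounter_ids = list(range(27619))
train_ids, test_ids = train_test_split_80_20(encounter_ids)
print(len(train_ids), len(test_ids))
```

If one patient can contribute several encounters, splitting by patient rather than by encounter would avoid leaking information between the two sets.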
3
Radhachandran A, Garikipati A, Iqbal Z, Siefkas A, Barnes G, Hoffman J, Mao Q, Das R. A machine learning approach to predicting risk of myelodysplastic syndrome. Leuk Res 2021; 109:106639. [PMID: 34171604 DOI: 10.1016/j.leukres.2021.106639] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/30/2021] [Revised: 05/18/2021] [Accepted: 06/05/2021] [Indexed: 10/21/2022]
Abstract
BACKGROUND Early myelodysplastic syndrome (MDS) diagnosis can allow physicians to provide early treatment, which may delay advancement of MDS and improve quality of life. However, MDS often goes unrecognized and is difficult to distinguish from other disorders. We developed a machine learning algorithm for the prediction of MDS one year prior to clinical diagnosis of the disease. METHODS Retrospective analysis was performed on 790,470 patients over the age of 45 seen in the United States between 2007 and 2020. A gradient boosted decision tree model (XGB) was built to predict MDS diagnosis using vital signs, lab results, and demographics from the prior two years of patient data. The XGB model was compared to logistic regression (LR) and artificial neural network (ANN) models. The models did not use blast percentage and cytogenetics information as inputs. Predictions were made one year prior to MDS diagnosis as determined by International Classification of Diseases (ICD) codes, 9th and 10th revisions. Performance was assessed with regard to area under the receiver operating characteristic curve (AUROC). RESULTS On a hold-out test set, the XGB model achieved an AUROC value of 0.87 for prediction of MDS one year prior to diagnosis, with a sensitivity of 0.79 and specificity of 0.80. The XGB model was compared against LR and ANN models, which achieved AUROCs of 0.838 and 0.832, respectively. CONCLUSIONS Machine learning may allow for early MDS diagnosis and more appropriate treatment administration.
Journal Article
4
Thapa R, Garikipati A, Shokouhi S, Hurtado M, Barnes G, Hoffman J, Calvert J, Katzmann L, Mao Q, Das R. Usability of Electronic Health records in Predicting Short-term falls: Machine learning Applications in Senior Care Facilities (Preprint). JMIR Aging 2021; 5:e35373. [PMID: 35363146 PMCID: PMC9015781 DOI: 10.2196/35373] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/02/2021] [Revised: 01/16/2022] [Accepted: 02/07/2022] [Indexed: 11/23/2022]
Abstract
Background Short-term fall prediction models that use electronic health records (EHRs) may enable the implementation of dynamic care practices that specifically address changes in individualized fall risk within senior care facilities. Objective The aim of this study is to implement machine learning (ML) algorithms that use EHR data to predict a 3-month fall risk in residents from a variety of senior care facilities providing different levels of care. Methods This retrospective study obtained EHR data (2007-2021) from Juniper Communities’ proprietary database of 2785 individuals primarily residing in skilled nursing facilities, independent living facilities, and assisted living facilities across the United States. We assessed the performance of 3 ML-based fall prediction models and the Juniper Communities’ fall risk assessment. Additional analyses were conducted to examine how changes in the input features, training data sets, and prediction windows affected the performance of these models. Results The Extreme Gradient Boosting model exhibited the highest performance, with an area under the receiver operating characteristic curve of 0.846 (95% CI 0.794-0.894), specificity of 0.848, diagnostic odds ratio of 13.40, and sensitivity of 0.706, while achieving the best trade-off in balancing true positive and negative rates. The number of active medications was the most significant feature associated with fall risk, followed by a resident’s number of active diseases and several variables associated with vital signs, including diastolic blood pressure and changes in weight and respiratory rates. The combination of vital signs with traditional risk factors as input features achieved higher prediction accuracy than using either group of features alone. 
Conclusions This study shows that the Extreme Gradient Boosting technique can use a large number of features from EHR data to make short-term fall predictions with a better performance than that of conventional fall risk assessments and other ML models. The integration of routinely collected EHR data, particularly vital signs, into fall prediction models may generate more accurate fall risk surveillance than models without vital signs. Our data support the use of ML models for dynamic, cost-effective, and automated fall predictions in different types of senior care facilities.
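The diagnostic odds ratio reported in the Results can be related to the reported sensitivity and specificity. The sketch below uses the standard definition (positive likelihood ratio divided by negative likelihood ratio); applying it to the rounded figures from the abstract is an assumption, since the authors presumably worked from unrounded counts.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = LR+ / LR-, where LR+ = sens / (1 - spec) and LR- = (1 - sens) / spec."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr / negative_lr


# Sensitivity 0.706 and specificity 0.848 are the rounded values from the abstract.
dor = diagnostic_odds_ratio(0.706, 0.848)
print(round(dor, 2))
```

The result agrees with the reported 13.40 to within rounding, which suggests the three figures are mutually consistent.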
5
Tso CF, Garikipati A, Green-Saxena A, Mao Q, Das R. Correlation of Population SARS-CoV-2 Cycle Threshold Values to Local Disease Dynamics: Exploratory Observational Study. JMIR Public Health Surveill 2021; 7:e28265. [PMID: 33999831 PMCID: PMC8176948 DOI: 10.2196/28265] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/01/2021] [Revised: 04/20/2021] [Accepted: 04/26/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND Despite the limitations in the use of cycle threshold (CT) values for individual patient care, population distributions of CT values may be useful indicators of local outbreaks. OBJECTIVE We aimed to conduct an exploratory analysis of potential correlations between the population distribution of cycle threshold (CT) values and COVID-19 dynamics, which were operationalized as percent positivity, transmission rate (Rt), and COVID-19 hospitalization count. METHODS In total, 148,410 specimens collected between September 15, 2020, and January 11, 2021, from the greater El Paso area were processed in the Dascena COVID-19 Laboratory. The daily median CT value, daily Rt, daily count of COVID-19 hospitalizations, daily change in percent positivity, and rolling averages of these features were plotted over time. Two-way scatterplots and linear regression were used to evaluate possible associations between daily median CT values and outbreak measures. Cross-correlation plots were used to determine whether a time delay existed between changes in daily median CT values and measures of community disease dynamics. RESULTS Daily median CT values negatively correlated with the daily Rt values (P<.001), the daily COVID-19 hospitalization counts (with a 33-day time delay; P<.001), and the daily changes in percent positivity among testing samples (P<.001). Despite visual trends suggesting time delays in the plots for median CT values and outbreak measures, a statistically significant delay was only detected between changes in median CT values and COVID-19 hospitalization counts (P<.001). CONCLUSIONS This study adds to the literature by analyzing samples collected from an entire geographical area and contextualizing the results with other research investigating population CT values.
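The cross-correlation step described in the Methods, which looks for a time delay between one daily series and another, can be sketched as a search over lags. The code below is a simplified illustration on synthetic data: the series, the 3-day delay, and the function names are all made up for the example, and the study's actual procedure may differ.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlations(leading, lagging, max_lag):
    """Correlate leading[t] with lagging[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        xs = leading[: len(leading) - lag]
        ys = lagging[lag:]
        result[lag] = pearson(xs, ys)
    return result


# Synthetic example: the outcome is the negated driver series, delayed by 3 days.
driver = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0, 8, 3, 5, 9, 1, 7]
delay = 3
outcome = [0] * delay + [-v for v in driver[: len(driver) - delay]]
correlations = lagged_correlations(driver, outcome, max_lag=6)
best_lag = min(correlations, key=correlations.get)  # most negative correlation
print(best_lag)
```

Because the abstract reports negative correlations, the sketch picks the lag with the most negative coefficient; for a positively related pair one would take the maximum instead.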
Observational Study
6
Ghandian S, Thapa R, Garikipati A, Barnes G, Green-Saxena A, Calvert J, Mao Q, Das R. Machine learning to predict progression of non-alcoholic fatty liver to non-alcoholic steatohepatitis or fibrosis. JGH Open 2022; 6:196-204. [PMID: 35355667 PMCID: PMC8938756 DOI: 10.1002/jgh3.12716] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/06/2021] [Revised: 11/15/2021] [Accepted: 02/06/2022] [Indexed: 12/12/2022]
Abstract
Background Non-alcoholic fatty liver (NAFL) can progress to the severe subtype non-alcoholic steatohepatitis (NASH) and/or fibrosis, which are associated with increased morbidity, mortality, and healthcare costs. Current machine learning studies detect NASH; however, this study is unique in predicting the progression of NAFL patients to NASH or fibrosis. Aim To utilize clinical information from NAFL-diagnosed patients to predict the likelihood of progression to NASH or fibrosis. Methods Data were collected from electronic health records of patients receiving a first-time NAFL diagnosis. A gradient boosted machine learning algorithm (XGBoost) as well as logistic regression (LR) and multi-layer perceptron (MLP) models were developed. A five-fold cross-validation grid search was utilized for hyperparameter optimization of variables, including maximum tree depth, learning rate, and number of estimators. Predictions of patients likely to progress to NASH or fibrosis within 4 years of initial NAFL diagnosis were made using demographic features, vital signs, and laboratory measurements. Results The XGBoost algorithm achieved area under the receiver operating characteristic curve (AUROC) values of 0.79 for prediction of progression to NASH and 0.87 for fibrosis on both hold-out and external validation test sets. The XGBoost algorithm outperformed the LR and MLP models for both NASH and fibrosis prediction on all metrics. Conclusion It is possible to accurately identify newly diagnosed NAFL patients at high risk of progression to NASH or fibrosis. Early identification of these patients may allow for increased clinical monitoring, more aggressive preventative measures to slow the progression of NAFL and fibrosis, and efficient clinical trial enrollment.
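The five-fold cross-validation grid search mentioned in the Methods can be sketched in plain Python as below. This is a structural illustration only: the grid values are hypothetical (the abstract names the three hyperparameters but not their candidate values), and `toy_fit_and_score` is a placeholder where a real pipeline would train a model on the training fold and return its validation AUROC.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds


def grid_search_cv(param_grid, n_samples, fit_and_score, k=5):
    """Return the parameter combination with the best mean score across k folds."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [fit_and_score(params, train, val)
                  for train, val in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:  # strict '>' keeps the first of any tied combinations
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_fit_and_score(params, train_idx, val_idx):
    """Placeholder scorer (assumption): ignores the data and n_estimators entirely."""
    return 1.0 - abs(params["max_depth"] - 4) * 0.05 - abs(params["learning_rate"] - 0.1)


# Hypothetical grid over the three hyperparameters named in the abstract.
grid = {"max_depth": [2, 4, 6], "learning_rate": [0.01, 0.1, 0.3], "n_estimators": [100, 300]}
best_params, best_score = grid_search_cv(grid, n_samples=100, fit_and_score=toy_fit_and_score)
print(best_params)
```

Passing the scorer in as a function keeps the search loop independent of any particular model library.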
research-article
7
Radhachandran A, Garikipati A, Zelin NS, Pellegrini E, Ghandian S, Calvert J, Hoffman J, Mao Q, Das R. Prediction of short-term mortality in acute heart failure patients using minimal electronic health record data. BioData Min 2021; 14:23. [PMID: 33789700 PMCID: PMC8010502 DOI: 10.1186/s13040-021-00255-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/21/2020] [Accepted: 03/21/2021] [Indexed: 12/15/2022]
Abstract
Background Acute heart failure (AHF) is associated with significant morbidity and mortality. Effective patient risk stratification is essential to guiding hospitalization decisions and the clinical management of AHF. Clinical decision support systems can be used to improve predictions of mortality made in emergency care settings for the purpose of AHF risk stratification. In this study, several models for the prediction of seven-day mortality among AHF patients were developed by applying machine learning techniques to retrospective patient data from 236,275 total emergency department (ED) encounters, 1881 of which were considered positive for AHF and were used for model training and testing. The models used varying subsets of age, sex, vital signs, and laboratory values. Model performance was compared to the Emergency Heart Failure Mortality Risk Grade (EHMRG) model, a commonly used system for prediction of seven-day mortality in the ED with similar (or, in some cases, more extensive) inputs. Model performance was assessed in terms of area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity. Results When trained and tested on a large academic dataset, the best-performing model and EHMRG demonstrated test set AUROCs of 0.84 and 0.78, respectively, for prediction of seven-day mortality. Given only measurements of respiratory rate, temperature, mean arterial pressure, and FiO2, one model produced a test set AUROC of 0.83. Neither a logistic regression comparator nor a simple decision tree outperformed EHMRG. Conclusions A model using only the measurements of four clinical variables outperforms EHMRG in the prediction of seven-day mortality in AHF. With these inputs, the model could not be replaced by logistic regression or reduced to a simple decision tree without significant performance loss. 
In ED settings, this minimal-input risk stratification tool may assist clinicians in making critical decisions about patient disposition by providing early and accurate insights into individual patient’s risk profiles. Supplementary Information The online version contains supplementary material available at 10.1186/s13040-021-00255-w.
Journal Article
8
Panchavati S, Lam C, Zelin NS, Pellegrini E, Barnes G, Hoffman J, Garikipati A, Calvert J, Mao Q, Das R. Retrospective validation of a machine learning clinical decision support tool for myocardial infarction risk stratification. Healthc Technol Lett 2021; 8:139-147. [PMID: 34938570 PMCID: PMC8667565 DOI: 10.1049/htl2.12017] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/10/2021] [Revised: 05/26/2021] [Accepted: 06/10/2021] [Indexed: 12/22/2022]
Abstract
Diagnosis and appropriate intervention for myocardial infarction (MI) are time-sensitive but rely on clinical measures that can be progressive and initially inconclusive, underscoring the need for an accurate and early predictor of MI to support diagnostic and clinical management decisions. The objective of this study was to develop a machine learning algorithm (MLA) to predict MI diagnosis based on electronic health record (EHR) data readily available during Emergency Department assessment. An MLA was developed using retrospective patient data. The MLA used patient data as they became available in the first 3 h of care to predict MI diagnosis (defined by International Classification of Diseases, 10th revision code) at any time during the encounter. The MLA obtained an area under the receiver operating characteristic curve of 0.87, sensitivity of 87% and specificity of 70%, outperforming the comparator scoring systems TIMI and GRACE on all metrics. An MLA can synthesize complex EHR data to serve as a clinically relevant risk stratification tool for MI.
research-article
9
Maharjan J, Garikipati A, Dinenno FA, Ciobanu M, Barnes G, Browning E, DeCurzio J, Mao Q, Das R. Machine learning determination of applied behavioral analysis treatment plan type. Brain Inform 2023; 10:7. [PMID: 36862316 PMCID: PMC9981822 DOI: 10.1186/s40708-023-00186-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/30/2022] [Accepted: 02/06/2023] [Indexed: 03/03/2023]
Abstract
BACKGROUND Applied behavioral analysis (ABA) is regarded as the gold standard treatment for autism spectrum disorder (ASD) and has the potential to improve outcomes for patients with ASD. It can be delivered at different intensities, which are classified as comprehensive or focused treatment approaches. Comprehensive ABA targets multiple developmental domains and involves 20-40 h/week of treatment. Focused ABA targets individual behaviors and typically involves 10-20 h/week of treatment. Determining the appropriate treatment intensity involves patient assessment by trained therapists; however, the final determination is highly subjective and lacks a standardized approach. In our study, we examined the ability of a machine learning (ML) prediction model to classify which treatment intensity would be best suited to individual patients with ASD who are undergoing ABA treatment. METHODS Retrospective data from 359 patients diagnosed with ASD were analyzed and included in the training and testing of an ML model for predicting comprehensive or focused treatment for individuals undergoing ABA treatment. Data inputs included demographics, schooling, behavior, skills, and patient goals. A gradient-boosted tree ensemble method, XGBoost, was used to develop the prediction model, which was then compared against a standard of care comparator encompassing features specified by the Behavior Analyst Certification Board treatment guidelines. Prediction model performance was assessed via area under the receiver-operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). RESULTS The prediction model achieved excellent performance for classifying patients in the comprehensive versus focused treatment groups (AUROC: 0.895; 95% CI 0.811-0.962) and outperformed the standard of care comparator (AUROC 0.767; 95% CI 0.629-0.891). 
The prediction model also achieved sensitivity of 0.789, specificity of 0.808, PPV of 0.6, and NPV of 0.913. Out of 71 patients whose data were employed to test the prediction model, only 14 misclassifications occurred. A majority of misclassifications (n = 10) indicated comprehensive ABA treatment for patients that had focused ABA treatment as the ground truth, therefore still providing a therapeutic benefit. The three most important features contributing to the model's predictions were bathing ability, age, and hours per week of past ABA treatment. CONCLUSION This research demonstrates that the ML prediction model performs well to classify appropriate ABA treatment plan intensity using readily available patient data. This may aid with standardizing the process for determining appropriate ABA treatments, which can facilitate initiation of the most appropriate treatment intensity for patients with ASD and improve resource allocation.
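The four threshold metrics reported above follow directly from a 2x2 confusion matrix. The abstract does not list the matrix itself; the counts below are back-calculated from the reported figures (71 test patients, 14 misclassifications, 10 of them predicted comprehensive for truly focused patients) under the assumption that comprehensive treatment is the positive class, so they should be read as an inference rather than as published data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Inferred counts (assumption): 15 + 10 + 42 + 4 = 71 patients, 10 + 4 = 14 errors.
metrics = classification_metrics(tp=15, fp=10, tn=42, fn=4)
for name, value in metrics.items():
    print(name, round(value, 3))
```

With these counts the four values round to 0.789, 0.808, 0.6, and 0.913, matching the abstract.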
research-article
10
Garikipati A, Ciobanu M, Singh NP, Barnes G, Decurzio J, Mao Q, Das R. Clinical Outcomes of a Hybrid Model Approach to Applied Behavioral Analysis Treatment. Cureus 2023; 15:e36727. [PMID: 36998917 PMCID: PMC10047423 DOI: 10.7759/cureus.36727] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 03/24/2023] [Indexed: 03/29/2023]
Abstract
Objective This study examines the implementation of a hybrid applied behavioral analysis (ABA) treatment model to determine its impact on autism spectrum disorder (ASD) patient outcomes. Methods Retrospective data were collected for 25 pediatric patients to measure progress before and after the implementation of a hybrid ABA treatment model under which therapists consistently captured session notes electronically regarding goals and patient progress. ABA treatment was streamlined for consistent delivery, with improved software utilization for tracking scheduling and progress. Eleven goals within three domains (behavioral, social, and communication) were examined. Results After the implementation of the hybrid model, the goal success rate improved by 9.7% compared to the baseline; 41.8% of goals showed improvement, 38.4% showed a flat trend, and 19.8% showed deterioration. Multiple goals trended upwards in 76% of the patients. Conclusion This pilot study demonstrated that enhancing the consistency with which ABA treatment is monitored/delivered can improve patient outcomes as seen through improved attainment of goals.
11
Thapa R, Garikipati A, Ciobanu M, Singh NP, Browning E, DeCurzio J, Barnes G, Dinenno FA, Mao Q, Das R. Machine Learning Differentiation of Autism Spectrum Sub-Classifications. J Autism Dev Disord 2024; 54:4216-4231. [PMID: 37751097 PMCID: PMC11461775 DOI: 10.1007/s10803-023-06121-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 08/19/2023] [Indexed: 09/27/2023]
Abstract
PURPOSE Disorders on the autism spectrum have characteristics that can manifest as difficulties with communication, executive functioning, daily living, and more. These challenges can be mitigated with early identification. However, diagnostic criteria have changed from DSM-IV to DSM-5, which can make diagnosing a disorder on the autism spectrum complex. We evaluated machine learning to classify individuals as having one of three disorders of the autism spectrum under DSM-IV, or as non-spectrum. METHODS We employed machine learning to analyze retrospective data from 38,560 individuals. Inputs encompassed clinical, demographic, and assessment data. RESULTS The algorithm achieved AUROCs ranging from 0.863 to 0.980. The model correctly classified 80.5% of individuals; 12.6% of individuals from this dataset were misclassified with another disorder on the autism spectrum. CONCLUSION Machine learning can classify individuals as having a disorder on the autism spectrum or as non-spectrum using minimal data inputs.
research-article
12
Garikipati A, Ciobanu M, Singh NP, Barnes G, Dinenno FA, Geisel J, Mao Q, Das R. Parent-Led Applied Behavior Analysis to Impact Clinical Outcomes for Individuals on the Autism Spectrum: Retrospective Chart Review. JMIR Pediatr Parent 2024; 7:e62878. [PMID: 39476396 PMCID: PMC11540247 DOI: 10.2196/62878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 09/06/2024] [Accepted: 09/20/2024] [Indexed: 11/08/2024]
Abstract
Background Autism spectrum disorder (ASD) can have traits that impact multiple domains of functioning and quality of life, which can persist throughout life. To mitigate the impact of ASD on the long-term trajectory of an individual's life, it is imperative to seek early and adequate treatment via scientifically validated approaches, of which applied behavior analysis (ABA) is the gold standard. ABA treatment must be delivered via a behavior technician with oversight from a board-certified behavior analyst. However, shortages in certified ABA therapists create treatment access barriers for individuals on the autism spectrum. Increased ASD prevalence demands innovations for treatment delivery. Parent-led treatment models for neurodevelopmental conditions are effective yet underutilized and may be used to fill this care gap. Objective This study reports findings from a retrospective chart review of clinical outcomes for children who received parent-led ABA treatment and intends to examine the sustained impact that modifications to ABA delivery have had on a subset of patients of Montera, Inc. dba Forta ("Forta"), as measured by progress toward skill acquisition within multiple focus areas (FAs). Methods Parents received ≥40 hours of training in ABA prior to initiating treatment, and patients were prescribed focused (<25 hours/week) or comprehensive (>25-40 hours/week) treatment plans. Retrospective data were evaluated over ≥90 days for 30 patients. The clinical outcomes of patients were additionally assessed by age (2-5 years, 6-12 years, 13-22 years) and utilization of prescribed treatment. Treatment encompassed skill acquisition goals; to facilitate data collection consistency, successful attempts were logged within a software application built in-house. Results Improved goal achievement success between weeks 1-20 was observed for older age, all utilization, and both treatment plan type cohorts. 
Success rates increased over time for most FAs, with the exception of executive functioning in the youngest cohort and comprehensive plan cohort. Goal achievement experienced peaks and declines from week to week, as expected for ABA treatment; however, overall trends indicated increased skill acquisition success rates. Of 40 unique combinations of analysis cohorts and FAs, 20 showed statistically significant positive linear relationships (P<.05). Statistically significant positive linear relationships were observed in the high utilization cohort (communication with P=.04, social skills with P=.02); in the fair and full utilization cohorts (overall success with P=.03 for the fair utilization cohort and P=.001 for the full utilization cohort, and success in emotional regulation with P<.001 for the fair utilization cohort and P<.001 for the full utilization cohort); and in the comprehensive treatment cohort (communication with P=.001, emotional regulation with P=.045). Conclusions Parent-led ABA can lead to goal achievement and improved clinical outcomes and may be a viable solution to overcome treatment access barriers that delay initiation or continuation of care.
research-article
13
Panchavati S, Lam C, Garikipati A, Zelin N, Pellegrini E, Barnes G, Siefkas A, Hoffman J, Calvert J, Mao Q, Das R. A Machine-Learning Clinical Decision Support Tool for Myocardial Infarction Diagnosis. CARDIOVASCULAR REVASCULARIZATION MEDICINE 2021. [DOI: 10.1016/j.carrev.2021.06.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
14
Maharjan J, Garikipati A, Singh NP, Cyrus L, Sharma M, Ciobanu M, Barnes G, Thapa R, Mao Q, Das R. OpenMedLM: prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models. Sci Rep 2024; 14:14156. [PMID: 38898116 PMCID: PMC11187169 DOI: 10.1038/s41598-024-64827-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 06/13/2024] [Indexed: 06/21/2024]
Abstract
Large language models (LLMs) can accomplish specialized medical knowledge tasks; however, equitable access is hindered by the need for extensive fine-tuning and specialized medical data, and by limited access to proprietary models. Open-source (OS) medical LLMs show performance improvements and provide the transparency and compliance required in healthcare. We present OpenMedLM, a prompting platform delivering state-of-the-art (SOTA) performance for OS LLMs on medical benchmarks. We evaluated OS foundation LLMs (7B-70B) on medical benchmarks (MedQA, MedMCQA, PubMedQA, MMLU medical-subset) and selected Yi34B for developing OpenMedLM. Prompting strategies included zero-shot, few-shot, chain-of-thought, and ensemble/self-consistency voting. OpenMedLM delivered OS SOTA results on three medical LLM benchmarks, surpassing previous best-performing OS models that leveraged costly and extensive fine-tuning. OpenMedLM displays the first results to date demonstrating the ability of OS foundation models to optimize performance, absent specialized fine-tuning. The model achieved 72.6% accuracy on MedQA, outperforming the previous SOTA by 2.4%, and 81.7% accuracy on MMLU medical-subset, establishing itself as the first OS LLM to surpass 80% accuracy on this benchmark. Our results highlight medical-specific emergent properties in OS LLMs not documented elsewhere to date and validate the ability of OS models to accomplish healthcare tasks, highlighting the benefits of prompt engineering to improve performance of accessible LLMs for medical applications.
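Of the prompting strategies listed, self-consistency voting is the easiest to show in a few lines: the model is sampled several times on the same question and the most frequent final answer is kept. The sketch below covers only the voting step on hard-coded answers; how OpenMedLM samples, parses, or breaks ties is not described in the abstract, so the first-seen tie-break and the function name are assumptions.

```python
from collections import Counter


def self_consistency_vote(sampled_answers):
    """Return the most frequent answer; ties go to the answer sampled first."""
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(sampled_answers)
    top = max(counts.values())
    for answer in sampled_answers:
        if counts[answer] == top:
            return answer


# Hypothetical final answers parsed from five chain-of-thought samples.
samples = ["B", "A", "B", "C", "B"]
print(self_consistency_vote(samples))
```

In practice the inputs would be option letters or normalized strings extracted from each sampled completion.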
research-article
15
Adelson RP, Garikipati A, Zhou Y, Ciobanu M, Tawara K, Barnes G, Singh NP, Mao Q, Das R. Machine Learning Approach with Harmonized Multinational Datasets for Enhanced Prediction of Hypothyroidism in Patients with Type 2 Diabetes. Diagnostics (Basel) 2024; 14:1152. [PMID: 38893680 PMCID: PMC11172278 DOI: 10.3390/diagnostics14111152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 05/24/2024] [Accepted: 05/29/2024] [Indexed: 06/21/2024]
Abstract
Type 2 diabetes (T2D) is a global health concern with increasing prevalence. Comorbid hypothyroidism (HT) exacerbates kidney, cardiac, neurological and other complications of T2D; these risks can be mitigated pharmacologically upon detecting HT. The current HT standard of care (SOC) screening in T2D is infrequent, delaying HT diagnosis and treatment. We present a first-to-date machine learning algorithm (MLA) clinical decision tool to classify patients as low vs. high risk for developing HT comorbid with T2D; the MLA was developed using readily available patient data from harmonized multinational datasets. The MLA was trained on data from NIH All of Us (AoU) and UK Biobank (UKBB) (Combined dataset) and achieved a high negative predictive value (NPV) of 0.989 and an AUROC of 0.762 in the Combined dataset, exceeding AUROCs for the models trained on AoU or UKBB alone (0.666 and 0.622, respectively), indicating that increasing dataset diversity for MLA training improves performance. This high-NPV automated tool can supplement SOC screening and rule out T2D patients with low HT risk, allowing for the prioritization of lab-based testing for at-risk patients. Conversely, an MLA output that designates a patient to be at risk of developing HT allows for tailored clinical management and thereby promotes improved patient outcomes.
|
research-article |
1 |
|
16
|
Adelson RP, Ciobanu M, Garikipati A, Castell NJ, Singh NP, Barnes G, Rumph JK, Mao Q, Roane HS, Vaish A, Das R. Family-Centric Applied Behavior Analysis Facilitates Improved Treatment Utilization and Outcomes. J Clin Med 2024; 13:2409. [PMID: 38673682 PMCID: PMC11051390 DOI: 10.3390/jcm13082409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 04/15/2024] [Accepted: 04/19/2024] [Indexed: 04/28/2024] Open
Abstract
Background/Objective: Autism spectrum disorder (ASD) is a neurodevelopmental condition characterized by lifelong impacts on functional social and daily living skills, and restricted, repetitive behaviors (RRBs). Applied behavior analysis (ABA), the gold-standard treatment for ASD, has been extensively validated. ABA access is hindered by limited availability of qualified professionals and logistical and financial barriers. Scientifically validated, parent-led ABA can fill the accessibility gap by overcoming treatment barriers. This retrospective cohort study examines how our ABA treatment model, utilizing parent behavior technicians (pBTs) to deliver ABA, impacts adaptive behaviors and interfering behaviors (IBs) in a cohort of children on the autism spectrum with varying ASD severity levels, and with or without clinically significant IBs. Methods: Clinical outcomes of 36 patients ages 3-15 years were assessed using longitudinal changes in Vineland-3 after 3+ months of pBT-delivered ABA treatment. Results: Within the pBT model, our patients demonstrated clinically significant improvements in Vineland-3 Composite, domain, and subdomain scores, and utilization was higher in severe ASD. pBTs utilized more prescribed ABA when children initiated treatment with clinically significant IBs, and these children also showed greater gains in their Composite scores. Study limitations include sample size, inter-rater reliability, potential assessment metric bias and schedule variability, and confounding intrinsic or extrinsic factors. Conclusion: Overall, our pBT model facilitated high treatment utilization and showed robust effectiveness, achieving improved adaptive behaviors and reduced IBs when compared to conventional ABA delivery. The pBT model is a strong contender to fill the widening treatment accessibility gap and represents a powerful tool for addressing systemic problems in ABA treatment delivery.
|
research-article |
1 |
|
17
|
Varma A, Maharjan J, Garikipati A, Hurtado M, Shokouhi S, Mao Q. Early prediction of prostate cancer risk in younger men using polygenic risk scores and electronic health records. Cancer Med 2023; 12:379-386. [PMID: 35751453 PMCID: PMC9844630 DOI: 10.1002/cam4.4934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Revised: 03/04/2022] [Accepted: 05/24/2022] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND Prostate cancer (PCa) screening is not routinely conducted in men aged 55 and younger, although this age group accounts for more than 10% of cases. Polygenic risk scores (PRSs) and patient data applied toward early prediction of PCa may lead to earlier interventions and increased survival. We have developed machine learning (ML) models to predict PCa risk in men 55 and under using PRSs combined with patient data. METHODS We conducted a retrospective study on 91,106 male patients aged 35-55 using the UK Biobank database. Five gradient boosting models were developed and validated utilizing routine screening data, PRSs, additional clinical data, or combinations of the three. RESULTS Combinations of PRSs and patient data outperformed models that utilized PRS or patient data only, and the highest performing models achieved an area under the receiver operating characteristic curve of 0.788. Our models demonstrated a substantially lower false positive rate (35.4%) in comparison to standard screening using prostate-specific antigen (60%-67%). CONCLUSION This study provides the first preliminary evidence for the use of PRSs with patient data in an ML algorithm for PCa risk prediction in men aged 55 and under for whom screening is not standard practice.
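To make the false positive rates in this abstract concrete, the sketch below converts each rate into an expected number of false alarms among men without cancer. The cohort size of 1,000 and the function name are assumptions for illustration; only the rates (35.4% and 60%-67%) come from the abstract.

```python
def expected_false_positives(n_without_disease: int, false_positive_rate: float) -> int:
    """Expected false alarms: FPR = FP / (FP + TN), so FP = FPR * N_negative."""
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    return round(n_without_disease * false_positive_rate)


cohort = 1000  # hypothetical number of men without prostate cancer

model_fp = expected_false_positives(cohort, 0.354)     # ML models
psa_fp_low = expected_false_positives(cohort, 0.60)    # PSA, lower bound
psa_fp_high = expected_false_positives(cohort, 0.67)   # PSA, upper bound

print(model_fp, psa_fp_low, psa_fp_high)  # 354 600 670
```

Under these assumptions, the models would flag roughly 250 to 320 fewer cancer-free men per 1,000 than PSA screening.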
|
research-article |
2 |
|
18
|
Adelson RP, Ciobanu M, Garikipati A, Castell NJ, Barnes G, Tawara K, Singh NP, Rumph J, Mao Q, Vaish A, Das R. Family-Centric Applied Behavior Analysis Promotes Sustained Treatment Utilization and Attainment of Patient Goals. Cureus 2024; 16:e62377. [PMID: 39011193 PMCID: PMC11247253 DOI: 10.7759/cureus.62377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/08/2024] [Indexed: 07/17/2024] Open
Abstract
BACKGROUND/OBJECTIVES Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by social communication difficulties and restricted repetitive behaviors or interests. Applied behavior analysis (ABA) has been shown to significantly improve outcomes for individuals on the autism spectrum. However, challenges regarding access, cost, and provider shortages remain obstacles to treatment delivery. To this end, parents were trained as parent behavior technicians (pBTs), improving access to ABA, and empowering parents to provide ABA treatment in their own homes. We hypothesized that patients diagnosed with severe ASD would achieve the largest gains in overall success rates toward skill acquisition in comparison to patients diagnosed with mild or moderate ASD. Our secondary hypothesis was that patients with comprehensive treatment plans (>25-40 hours/week) would show greater gains in skill acquisition than those with focused treatment plans (≤25 hours/week). METHODS This longitudinal, retrospective chart review evaluated data from 243 patients aged two to 18 years who received at least three months of ABA within our pBT treatment delivery model. Patients were stratified by utilization of prescribed ABA treatment, age, ASD severity (per the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition), and treatment plan type (comprehensive vs. focused). Patient outcomes were assessed by examining success rates in acquiring skills, both overall and in specific focus areas (communication, emotional regulation, executive functioning, and social skills). RESULTS Patients receiving treatment within the pBT model demonstrated significant progress in skill acquisition both overall and within specific focus areas, regardless of cohort stratification. Patients with severe ASD showed greater overall skill acquisition gains than those with mild or moderate ASD. In addition, patients with comprehensive treatment plans showed significantly greater gains than those with focused treatment plans. CONCLUSION The pBT model achieved both sustained levels of high treatment utilization and progress toward patient goals. Patients showed significant gains in success rates of skill acquisition both overall and in specific focus areas, regardless of their level of treatment utilization. This study reveals that our pBT model of ABA treatment delivery leads to consistent improvements in communication, emotional regulation, executive functioning, and social skills across patients on the autism spectrum, particularly for those with more severe symptoms and those following comprehensive treatment plans.
|
|
1 |
|
19
|
Pellegrini E, Panchavati S, Lam C, Garikipati A, Zelin N, Barnes G, Siefkas A, Hoffman J, Handley M, Calvert J, Mao Q, Das R. A MACHINE LEARNING CLINICAL DECISION SUPPORT TOOL FOR MYOCARDIAL INFARCTION DIAGNOSIS. J Am Coll Cardiol 2021. [DOI: 10.1016/s0735-1097(21)02012-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
|
4 |
|
20
|
Adelson RP, Garikipati A, Maharjan J, Ciobanu M, Barnes G, Singh NP, Dinenno FA, Mao Q, Das R. Machine Learning Approach for Improved Longitudinal Prediction of Progression from Mild Cognitive Impairment to Alzheimer's Disease. Diagnostics (Basel) 2023; 14:13. [PMID: 38201322 PMCID: PMC10795823 DOI: 10.3390/diagnostics14010013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 12/08/2023] [Accepted: 12/15/2023] [Indexed: 01/12/2024] Open
Abstract
Mild cognitive impairment (MCI) is cognitive decline that can indicate future risk of Alzheimer's disease (AD). We developed and validated a machine learning algorithm (MLA), based on a gradient-boosted tree ensemble method, to analyze phenotypic data for individuals 55-88 years old (n = 493) diagnosed with MCI. Data were analyzed within multiple prediction windows and averaged to predict progression to AD within 24-48 months. The MLA outperformed the mini-mental state examination (MMSE) and three comparison models at all prediction windows on most metrics. The exceptions were sensitivity at 18 months (MLA and MMSE each achieved 0.600) and sensitivity at 30 and 42 months (MMSE marginally better). For all prediction windows, the MLA achieved AUROC ≥ 0.857 and NPV ≥ 0.800. With averaged data for the 24-48-month lookahead timeframe, the MLA outperformed MMSE on all metrics. This study demonstrates that machine learning may provide a more accurate risk assessment than the standard of care. This may facilitate care coordination, decrease healthcare expenditures, and maintain quality of life for patients at risk of progressing from MCI to AD.
|
research-article |
2 |
|