1
Zad Z, Jiang VS, Wolf AT, Wang T, Cheng JJ, Paschalidis IC, Mahalingaiah S. Predicting polycystic ovary syndrome with machine learning algorithms from electronic health records. Front Endocrinol (Lausanne) 2024; 15:1298628. [PMID: 38356959; PMCID: PMC10866556; DOI: 10.3389/fendo.2024.1298628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 01/08/2024] [Indexed: 02/16/2024] Open
Abstract
INTRODUCTION Predictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis. METHODS This is a retrospective cohort study using a safety-net hospital's electronic health records (EHR) from 2003-2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was a PCOS ICD-9 diagnosis; the additional model outcomes were algorithm-defined PCOS. The latter was based on the Rotterdam criteria, merging laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound. RESULTS We developed predictive models using four machine learning methods: logistic regression, support vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved an average AUC of 85%, 81%, 80%, and 82% in Models I, II, III, and IV, respectively. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG. CONCLUSION Machine learning algorithms were used to predict PCOS based on a large at-risk population. This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.
Affiliation(s)
- Zahra Zad
  - Division of Systems Engineering, Center for Information and Systems Engineering (CISE), Boston University, Brookline, MA, United States
- Victoria S. Jiang
  - Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology, Massachusetts General Hospital, Boston, MA, United States
- Amber T. Wolf
  - Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Taiyao Wang
  - Division of Systems Engineering, Center for Information and Systems Engineering (CISE), Boston University, Brookline, MA, United States
- J. Jojo Cheng
  - Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, United States
- Ioannis Ch. Paschalidis
  - Division of Systems Engineering, Center for Information and Systems Engineering (CISE), Boston University, Brookline, MA, United States
  - Department of Electrical & Computer Engineering, Department of Biomedical Engineering, and Faculty for Computing & Data Sciences, Boston University, Boston, MA, United States
- Shruthi Mahalingaiah
  - Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology, Massachusetts General Hospital, Boston, MA, United States
  - Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA, United States
2
Coleman GT, Al Snih S. Diabetes and Hospitalizations Among Mexican Americans Aged 75 Years and Older. J Prim Care Community Health 2024; 15:21501319241266108. [PMID: 39058533; PMCID: PMC11282514; DOI: 10.1177/21501319241266108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 06/07/2024] [Accepted: 06/12/2024] [Indexed: 07/28/2024] Open
Abstract
OBJECTIVE To examine factors associated with hospitalization among Mexican Americans aged 75 years and older with diabetes (with and without complications) and without diabetes over 12 years of follow-up. METHODS Participants (N = 1454) were from the Hispanic Established Population for the Epidemiologic Study of the Elderly (2004/2005-2016) residing in Arizona, California, Colorado, New Mexico, and Texas. Measures included socio-demographics, medical conditions, falls, depressive symptoms, cognitive function, disability, physician visits, and hospitalizations. Participants were categorized as no diabetes (N = 1028), diabetes without complications (N = 180), and diabetes with complications (N = 246). RESULTS Participants with diabetes and complications had greater odds over time (odds ratio = 1.56, 95% confidence interval = 1.23-1.98) of being admitted to the hospital in the prior year than those without diabetes. Participants with diabetes had greater odds of hospitalization if they had heart failure, falls, amputation, and insulin treatment. CONCLUSIONS In Mexican American older adults, diabetes and diabetes-related complications increased the risk of hospitalization.
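As a reading aid for the reported odds ratio, the short sketch below recovers the implied standard error on the log-odds scale from the published interval. It assumes the interval is a Wald-type 95% interval that is symmetric around log(OR); the abstract does not state how the interval was computed, and the function name `log_scale_summary` is hypothetical.

```python
from math import exp, log


def log_scale_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover log-odds quantities from a reported OR and its 95% CI.

    Assumption: the CI is a Wald interval, i.e. log(OR) +/- z_crit * SE.
    """
    log_low, log_high = log(ci_low), log(ci_high)
    # Half the CI width on the log scale equals z_crit * SE.
    std_err = (log_high - log_low) / (2 * z_crit)
    # If the interval is symmetric on the log scale, its midpoint
    # should reproduce the reported point estimate.
    midpoint_or = exp((log_low + log_high) / 2)
    # Approximate Wald z statistic for the null hypothesis OR = 1.
    z_stat = log(odds_ratio) / std_err
    return std_err, midpoint_or, z_stat


# Values reported in the abstract: OR = 1.56, 95% CI = 1.23-1.98.
std_err, midpoint_or, z_stat = log_scale_summary(1.56, 1.23, 1.98)
print(f"SE of log(OR): {std_err:.3f}")
print(f"OR implied by CI midpoint: {midpoint_or:.2f}")
print(f"Approximate z statistic: {z_stat:.2f}")
```

If the midpoint-implied odds ratio matches the reported point estimate to rounding, the symmetric log-scale assumption is at least consistent with the published numbers.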
Affiliation(s)
- Soham Al Snih
  - The University of Texas Medical Branch, Galveston, TX, USA
3
Zad Z, Jiang VS, Wolf AT, Wang T, Cheng JJ, Paschalidis IC, Mahalingaiah S. Predicting polycystic ovary syndrome (PCOS) with machine learning algorithms from electronic health records. medRxiv: The Preprint Server for Health Sciences 2023:2023.07.27.23293255. [PMID: 37577593; PMCID: PMC10418575; DOI: 10.1101/2023.07.27.23293255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
INTRODUCTION Predictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis. METHODS This is a retrospective cohort study using a safety-net hospital's electronic health records (EHR) from 2003-2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was a PCOS ICD-9 diagnosis; the additional model outcomes were algorithm-defined PCOS. The latter was based on the Rotterdam criteria, merging laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound. RESULTS We developed predictive models using four machine learning methods: logistic regression, support vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved AUCs of 85%, 81%, 80%, and 82% in Models I, II, III, and IV, respectively. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG. CONCLUSIONS Machine learning algorithms were used to predict PCOS based on a large at-risk population. This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.
4
Hu Y, Huerta J, Cordella N, Mishuris RG, Paschalidis IC. Personalized hypertension treatment recommendations by a data-driven model. BMC Med Inform Decis Mak 2023; 23:44. [PMID: 36859187; PMCID: PMC9979505; DOI: 10.1186/s12911-023-02137-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/11/2022] [Accepted: 02/09/2023] [Indexed: 03/03/2023] Open
Abstract
BACKGROUND Hypertension is a prevalent cardiovascular disease with severe longer-term implications. Conventional management based on clinical guidelines does not facilitate personalized treatment that accounts for a richer set of patient characteristics. METHODS Records from 1/1/2012 to 1/1/2020 at the Boston Medical Center were used, selecting patients with either a hypertension diagnosis or meeting diagnostic criteria (≥ 130 mmHg systolic or ≥ 90 mmHg diastolic, n = 42,752). Models were developed to recommend a class of antihypertensive medications for each patient based on their characteristics. Regression immunized against outliers was combined with a nearest neighbor approach to associate with each patient an affinity group of other patients. This group was then used to make predictions of future Systolic Blood Pressure (SBP) under each prescription type. For each patient, we leveraged these predictions to select the class of medication that minimized their future predicted SBP. RESULTS The proposed model, built with a distributionally robust learning procedure, leads to a reduction of 14.28 mmHg in SBP, on average. This reduction is 70.30% larger than the reduction achieved by the standard-of-care and 7.08% better than the corresponding reduction achieved by the 2nd best model which uses ordinary least squares regression. All derived models outperform following the previous prescription or the current ground truth prescription in the record. We randomly sampled and manually reviewed 350 patient records; 87.71% of these model-generated prescription recommendations passed a sanity check by clinicians. CONCLUSION Our data-driven approach for personalized hypertension treatment yielded significant improvement compared to the standard-of-care. The model implied potential benefits of computationally deprescribing and can support situations with clinical equipoise.
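The selection step described in this abstract (predict future SBP under each medication class from a group of similar patients, then choose the class with the lowest prediction) can be illustrated with a toy sketch. It is not the authors' model: it swaps the distributionally robust regression for a plain k-nearest-neighbor average, and the function name `recommend_class`, the two features, and every record below are hypothetical.

```python
from math import dist


def recommend_class(patient, records, k=2):
    """Return the medication class with the lowest k-NN-predicted future SBP.

    records: iterable of (features, medication_class, future_sbp).
    Assumption: features are already on comparable scales, so plain
    Euclidean distance is a reasonable similarity measure.
    """
    # Group (distance to patient, observed future SBP) pairs by class.
    by_class = {}
    for features, med_class, future_sbp in records:
        by_class.setdefault(med_class, []).append(
            (dist(patient, features), future_sbp)
        )
    # Predict future SBP per class as the mean over the k nearest patients.
    predictions = {}
    for med_class, pairs in by_class.items():
        nearest = sorted(pairs)[:k]
        predictions[med_class] = sum(sbp for _, sbp in nearest) / len(nearest)
    best = min(predictions, key=predictions.get)
    return best, predictions


# Hypothetical records: (age, current SBP), class given, SBP at follow-up.
records = [
    ((50, 150), "A", 145), ((52, 148), "A", 143), ((70, 160), "A", 150),
    ((51, 151), "B", 135), ((49, 149), "B", 133), ((68, 158), "B", 148),
]
best, predictions = recommend_class((50, 150), records, k=2)
print(best, predictions)
```

Averaging over neighbors within each class plays the role of the "affinity group" in the abstract; a real system would also need to handle confounding by indication, which this sketch ignores.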
Affiliation(s)
- Yang Hu
  - Department of Electrical and Computer Engineering, Division of Systems Engineering, Boston University, 8 Saint Mary's St., Boston, MA, 02215, USA
- Jasmine Huerta
  - Department of Medicine, Boston Medical Center, School of Medicine, Boston University, Boston, MA, USA
- Nicholas Cordella
  - Department of Medicine, Boston Medical Center, School of Medicine, Boston University, Boston, MA, USA
- Rebecca G Mishuris
  - Department of Medicine, Boston Medical Center, School of Medicine, Boston University, Boston, MA, USA
- Ioannis Ch Paschalidis
  - Department of Electrical and Computer Engineering, Division of Systems Engineering, Boston University, 8 Saint Mary's St., Boston, MA, 02215, USA
  - Department of Biomedical Engineering, Faculty of Computing & Data Sciences, Hariri Institute for Computing and Computational Science & Engineering, Boston University, 8 Saint Mary's St., Boston, MA, 02215, USA
5
Kanda E, Suzuki A, Makino M, Tsubota H, Kanemata S, Shirakawa K, Yajima T. Machine learning models for prediction of HF and CKD development in early-stage type 2 diabetes patients. Sci Rep 2022; 12:20012. [PMID: 36411366; PMCID: PMC9678863; DOI: 10.1038/s41598-022-24562-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/14/2022] [Accepted: 11/17/2022] [Indexed: 11/23/2022] Open
Abstract
Chronic kidney disease (CKD) and heart failure (HF) are the first and most frequent comorbidities associated with mortality risks in early-stage type 2 diabetes mellitus (T2DM). However, efficient screening and risk assessment strategies for identifying T2DM patients at high risk of developing CKD and/or HF (CKD/HF) remain to be established. This study aimed to generate a novel machine learning (ML) model to predict the risk of developing CKD/HF in early-stage T2DM patients. The models were derived from a retrospective cohort of 217,054 T2DM patients without a history of cardiovascular and renal diseases extracted from a Japanese claims database. Among the ML algorithms used, extreme gradient boosting exhibited the best performance for CKD/HF diagnosis and hospitalization after internal validation and was further validated using another dataset including 16,822 patients. In the external validation, the 5-year prediction areas under the receiver operating characteristic curve for CKD/HF diagnosis and hospitalization were 0.718 and 0.837, respectively. In Kaplan-Meier curve analysis, patients predicted to be at high risk showed a significant increase in CKD/HF diagnosis and hospitalization compared with those at low risk. Thus, the developed model predicted the risk of developing CKD/HF in T2DM patients with reasonable probability in the external validation cohort. A clinical approach that uses ML models to identify T2DM patients at high risk of developing CKD/HF may contribute to improved prognosis by promoting early diagnosis and intervention.
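For readers unfamiliar with the metric reported here, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. The sketch below computes it directly from that definition; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = event) and predicted risk scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases; for cohorts of the size described above, a rank-based implementation from a statistics library would be the practical choice.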
Affiliation(s)
- Eiichiro Kanda
  - Medical Science, Kawasaki Medical University, Okayama, Japan
- Atsushi Suzuki
  - Department of Endocrinology, Diabetes and Metabolism, Fujita Health University, Toyoake, Aichi, Japan
- Masaki Makino
  - Department of Endocrinology, Diabetes and Metabolism, Fujita Health University, Toyoake, Aichi, Japan
- Hiroo Tsubota
  - AstraZeneca K.K., Osaka, Japan
- Satomi Kanemata
  - Ono Pharmaceutical Co., Ltd., Osaka, Japan
6
Nicolucci A, Romeo L, Bernardini M, Vespasiani M, Rossi MC, Petrelli M, Ceriello A, Di Bartolo P, Frontoni E, Vespasiani G. Prediction of complications of type 2 Diabetes: A Machine learning approach. Diabetes Res Clin Pract 2022; 190:110013. [PMID: 35870573; DOI: 10.1016/j.diabres.2022.110013] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/26/2022] [Revised: 07/11/2022] [Accepted: 07/16/2022] [Indexed: 11/03/2022]
Abstract
AIM To construct predictive models of diabetes complications (DCs) by big data machine learning, based on electronic medical records. METHODS Six groups of DCs were considered: eye complications; cardiovascular, cerebrovascular, and peripheral vascular disease; nephropathy; and diabetic neuropathy. A supervised, tree-based learning approach (XGBoost) was used to predict the onset of each complication within 5 years (task 1). Furthermore, a separate prediction for early (within 2 years) and late (3-5 years) onset of complication (task 2) was performed. A dataset of 147,664 patients seen over 15 years by 23 centers was used. External validation was performed in five additional centers. Models were evaluated by considering accuracy, sensitivity, specificity, and area under the ROC curve (AUC). RESULTS For all DCs considered, the predictive models in task 1 showed an accuracy > 70%, and AUC largely exceeded 0.80, reaching 0.97 for nephropathy. For task 2, all predictive models showed an accuracy > 70% and an AUC > 0.85. Sensitivity in predicting the early occurrence of the complication ranged between 83.2% (peripheral vascular disease) and 88.5% (nephropathy). CONCLUSIONS A machine learning approach offers the opportunity to identify patients at greater risk of complications. This can help overcome clinical inertia and improve the quality of diabetes care.
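The three threshold-based metrics named in this abstract all derive from the same confusion matrix. The sketch below shows the standard definitions on a toy example; the function name `classification_metrics` and the labels are illustrative assumptions and do not reproduce the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = complication)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of non-cases that are cleared
    }


# Hypothetical ground truth and model predictions for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Unlike AUC, these values depend on the decision threshold used to turn a predicted risk into a yes/no flag, which is why the abstract reports them alongside AUC rather than instead of it.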
Affiliation(s)
- Antonio Nicolucci
  - Center for Outcomes Research and Clinical Epidemiology - CORESEARCH, Pescara, Italy
- Luca Romeo
  - Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy
- Michele Bernardini
  - Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy
- Maria Chiara Rossi
  - Center for Outcomes Research and Clinical Epidemiology - CORESEARCH, Pescara, Italy
- Massimiliano Petrelli
  - Clinic of Endocrinology and Metabolic Diseases, Department of Clinical and Molecular Sciences, Marche Polytechnic University, Ancona, Italy
- Emanuele Frontoni
  - Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy
7
Brady V, Whisenant M, Wang X, Ly VK, Zhu G, Aguilar D, Wu H. Characterization of Symptoms and Symptom Clusters for Type 2 Diabetes Using a Large Nationwide Electronic Health Record Database. Diabetes Spectr 2022; 35:159-170. [PMID: 35668892; PMCID: PMC9160545; DOI: 10.2337/ds21-0064] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
Abstract
OBJECTIVE A variety of symptoms may be associated with type 2 diabetes and its complications. Symptoms in chronic diseases may be described in terms of prevalence, severity, and trajectory and often co-occur in groups, known as symptom clusters, which may be representative of a common etiology. The purpose of this study was to characterize type 2 diabetes-related symptoms using a large nationwide electronic health record (EHR) database. METHODS We acquired Cerner Health Facts, a nationwide EHR database. The type 2 diabetes cohort (n = 1,136,301 patients) was identified using a rule-based phenotype method. A multistep procedure was then used to identify type 2 diabetes-related symptoms based on International Classification of Diseases, 9th and 10th revisions, diagnosis codes. Type 2 diabetes-related symptoms and co-occurring symptom clusters, including their temporal patterns, were characterized based on the longitudinal EHR data. RESULTS Patients had a mean age of 61.4 years, 51.2% were female, and 70.0% were White. Among 1,136,301 patients, there were 8,008,276 occurrences of 59 symptoms. The most frequently reported symptoms included pain, heartburn, shortness of breath, fatigue, and swelling, which occurred in 21-60% of the patients. We also observed over-represented type 2 diabetes symptoms, including difficulty speaking, feeling confused, trouble remembering, weakness, and drowsiness/sleepiness. Some of these are rare and difficult to detect by traditional patient-reported outcomes studies. CONCLUSION To the best of our knowledge, this is the first study to use a nationwide EHR database to characterize type 2 diabetes-related symptoms and their temporal patterns. Fifty-nine symptoms, including both over-represented and rare diabetes-related symptoms, were identified.
Affiliation(s)
- Veronica Brady
  - Cizik School of Nursing, The University of Texas Health Science Center at Houston, Houston, TX
- Meagan Whisenant
  - Cizik School of Nursing, The University of Texas Health Science Center at Houston, Houston, TX
- Xueying Wang
  - School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
- Vi K. Ly
  - School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
- Gen Zhu
  - School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
- David Aguilar
  - McGovern School of Medicine, The University of Texas Health Science Center at Houston, Houston, TX
- Hulin Wu
  - School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
  - Corresponding author: Hulin Wu,
8
Fregoso-Aparicio L, Noguez J, Montesinos L, García-García JA. Machine learning and deep learning predictive models for type 2 diabetes: a systematic review. Diabetol Metab Syndr 2021; 13:148. [PMID: 34930452; PMCID: PMC8686642; DOI: 10.1186/s13098-021-00767-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 07/06/2021] [Accepted: 12/07/2021] [Indexed: 12/12/2022] Open
Abstract
Diabetes mellitus is a severe, chronic disease that occurs when blood glucose levels rise above certain limits. In recent years, machine and deep learning techniques have been used to predict diabetes and its complications. However, researchers and developers still face two main challenges when building type 2 diabetes predictive models. First, there is considerable heterogeneity in previous studies regarding the techniques used, making it challenging to identify the optimal one. Second, there is a lack of transparency about the features used in the models, which reduces their interpretability. This systematic review aimed to address these challenges. The review primarily followed the PRISMA methodology, enriched with the one proposed by Keele and Durham Universities. Ninety studies were included, and the type of model, complementary techniques, dataset, and performance parameters reported were extracted. Eighteen different types of models were compared, with tree-based algorithms showing top performances. Deep neural networks proved suboptimal, despite their ability to deal with big and dirty data. Data balancing and feature selection techniques proved helpful in increasing the models' efficiency. Models trained on tidy datasets achieved almost perfect performance.
Affiliation(s)
- Luis Fregoso-Aparicio
  - School of Engineering and Sciences, Tecnologico de Monterrey, Av Lago de Guadalupe KM 3.5, Margarita Maza de Juarez, 52926 Cd Lopez Mateos, Mexico
- Julieta Noguez
  - School of Engineering and Sciences, Tecnologico de Monterrey, Ave. Eugenio Garza Sada 2501, 64849 Monterrey, Nuevo Leon, Mexico
- Luis Montesinos
  - School of Engineering and Sciences, Tecnologico de Monterrey, Ave. Eugenio Garza Sada 2501, 64849 Monterrey, Nuevo Leon, Mexico
- José A. García-García
  - Hospital General de Mexico Dr. Eduardo Liceaga, Dr. Balmis 148, Doctores, Cuauhtemoc, 06720 Mexico City, Mexico
9
Sekeroglu B, Tuncal K. Prediction of cancer incidence rates for the European continent using machine learning models. Health Informatics J 2021; 27:1460458220983878. [PMID: 33506703; DOI: 10.1177/1460458220983878] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/16/2022]
Abstract
Cancer is one of the most important and common public health problems worldwide and occurs in many different types. Treatments and precautions aim to minimize the deaths caused by cancer; however, incidence rates continue to rise. Thus, it is important to analyze and estimate incidence rates to support the determination of more effective precautions. In this research, the 2018 Cancer Datasheet of the World Health Organization (WHO) is used, and all countries on the European continent are considered, to analyze and predict incidence rates until 2020 for lung cancer, breast cancer, colorectal cancer, prostate cancer, and all types of cancer, which have the highest incidence and mortality rates. For each cancer type, six machine learning models, namely linear regression, support vector regression, decision tree, long short-term memory neural network, backpropagation neural network, and radial basis function neural network, were trained separately by gender. Linear regression and support vector regression outperformed the other models in initial experiments, with R² scores of 0.99 and 0.98, respectively, and were then used to predict incidence rates of the considered cancer types. The ML models estimated that the maximum rise in incidence rates would be in colorectal cancer for females, at 6%.
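To make the R² figures concrete, the sketch below fits a one-variable linear regression by ordinary least squares and scores it with the coefficient of determination. The yearly incidence values are invented for illustration and are not taken from the WHO data; the helper names `fit_line` and `r2_score` are likewise assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(ys, preds):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical incidence rates per 100,000 for one cancer type and sex.
years = [2012, 2013, 2014, 2015, 2016, 2017, 2018]
rates = [60.0, 61.1, 61.9, 63.2, 64.0, 64.8, 66.1]

slope, intercept = fit_line(years, rates)
fitted = [slope * x + intercept for x in years]
r2 = r2_score(rates, fitted)
forecast_2020 = slope * 2020 + intercept  # extrapolation beyond the data
print(f"R^2 = {r2:.4f}, forecast for 2020 = {forecast_2020:.1f}")
```

An R² close to 1 on a short, nearly linear series mostly reflects the smoothness of the trend, so it says little about how far the extrapolation can be trusted.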
Affiliation(s)
- Boran Sekeroglu
  - Information Systems Engineering, Near East University, Cyprus
- Kubra Tuncal
  - Information Systems Engineering, Near East University, Cyprus
10
Wang K, Tian J, Zheng C, Yang H, Ren J, Li C, Han Q, Zhang Y. Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning. Risk Manag Healthc Policy 2021; 14:2453-2463. [PMID: 34149290; PMCID: PMC8206455; DOI: 10.2147/rmhp.s310295] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/09/2021] [Accepted: 05/24/2021] [Indexed: 01/14/2023] Open
Abstract
PURPOSE This study sought to develop models with good identification of adverse outcomes in patients with heart failure (HF) and to find strong factors that affect prognosis. PATIENTS AND METHODS A total of 5004 qualifying cases were selected, among which 498 cases had adverse outcomes and 4506 cases were discharged after improvement. The study subjects were hospitalized patients diagnosed with HF from a regional cardiovascular hospital and the cardiology department of a medical university hospital in Shanxi Province of China between January 2014 and June 2019. The synthetic minority oversampling technique combined with edited nearest neighbors (SMOTE+ENN) was used to pre-process the unbalanced data. Traditional logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) were used to build risk identification models, and each model was repeated 100 times. Model discrimination and calibration were estimated using F1-score, the area under the receiver-operating characteristic curve (AUROC), and Brier score. The best performing of the five models was used to identify the risk of adverse outcomes and evaluate the influencing factors. RESULTS The SME-XGBoost was the best performing model with means of F1-score (0.3673, 95% confidence interval [CI]: 0.3633-0.3712), AUROC (0.8010, CI: 0.7974-0.8046), and Brier score (0.1769, CI: 0.1748-0.1789). Age, N-terminal pronatriuretic peptide, pulmonary disease, etc. were the most significant factors of adverse outcomes in patients with HF. CONCLUSION The combination of SMOTE+ENN and advanced machine learning methods effectively improved the discrimination efficacy of adverse outcomes in HF patients, accurately stratified patients at risk of adverse outcomes, and found the top factors of adverse outcomes. These models and factors emphasize the importance of health status data in determining adverse outcomes in patients with HF.
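Two of the metrics reported in this abstract, the F1-score and the Brier score, are easy to compute by hand, and doing so shows why a rare outcome (498 of 5004 cases) can yield a modest F1 even when ranking performance is good. The sketch below uses the standard definitions; the function names, the 0.5 threshold, and the toy labels and probabilities are illustrative assumptions rather than study data.

```python
def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Share of adverse outcomes in the cohort, from the counts in the abstract.
prevalence = 498 / 5004

# Hypothetical outcomes (1 = adverse) and predicted probabilities.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
probs = [0.8, 0.6, 0.1, 0.2, 0.1, 0.3, 0.7, 0.1, 0.2, 0.4]
y_pred = [int(p >= 0.5) for p in probs]  # assumed decision threshold of 0.5

brier = brier_score(y_true, probs)
f1 = f1_score(y_true, y_pred)
print(f"prevalence = {prevalence:.3f}, Brier = {brier:.3f}, F1 = {f1:.2f}")
```

With few positives, a handful of false alarms is enough to pull precision, and therefore F1, down sharply, which is the imbalance problem that resampling methods such as SMOTE+ENN are meant to mitigate.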
Affiliation(s)
- Ke Wang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People's Republic of China
  - Department of Epidemiology and Biostatistics, Xuzhou Medical University, Xuzhou, People's Republic of China
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
- Jing Tian
  - Department of Cardiology, The First Affiliated Hospital of Shanxi Medical University, Taiyuan, People's Republic of China
- Chu Zheng
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People's Republic of China
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
- Hong Yang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People's Republic of China
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
- Jia Ren
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People's Republic of China
- Chenhao Li
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People's Republic of China
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
- Qinghua Han
  - Department of Cardiology, The First Affiliated Hospital of Shanxi Medical University, Taiyuan, People's Republic of China
- Yanbo Zhang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People's Republic of China
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
11
Bakker L, Aarts J, Uyl-de Groot C, Redekop W. Economic evaluations of big data analytics for clinical decision-making: a scoping review. J Am Med Inform Assoc 2021; 27:1466-1475. [PMID: 32642750; PMCID: PMC7526472; DOI: 10.1093/jamia/ocaa102] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/30/2020] [Revised: 04/06/2020] [Accepted: 05/11/2020] [Indexed: 12/17/2022] Open
Abstract
OBJECTIVE Much has been invested in big data analytics to improve health and reduce costs. However, it is unknown whether these investments have achieved the desired goals. We performed a scoping review to determine the health and economic impact of big data analytics for clinical decision-making. MATERIALS AND METHODS We searched Medline, Embase, Web of Science and the National Health Services Economic Evaluations Database for relevant articles. We included peer-reviewed papers that report the health economic impact of analytics that assist clinical decision-making. We extracted the economic methods and estimated impact and also assessed the quality of the methods used. In addition, we estimated how many studies assessed "big data analytics" based on a broad definition of this term. RESULTS The search yielded 12 133 papers but only 71 studies fulfilled all eligibility criteria. Only a few papers were full economic evaluations; many were performed during development. Papers frequently reported savings for healthcare payers but only 20% also included costs of analytics. Twenty studies examined "big data analytics" and only 7 reported both cost-savings and better outcomes. DISCUSSION The promised potential of big data is not yet reflected in the literature, partly since only a few full and properly performed economic evaluations have been published. This and the lack of a clear definition of "big data" limit policy makers and healthcare professionals from determining which big data initiatives are worth implementing.
Affiliation(s)
- Lytske Bakker
  - Erasmus School of Health Policy and Management, Erasmus University, Rotterdam, Netherlands
  - Institute for Medical Technology Assessment, Erasmus University, Rotterdam, Netherlands
- Jos Aarts
  - Erasmus School of Health Policy and Management, Erasmus University, Rotterdam, Netherlands
- Carin Uyl-de Groot
  - Erasmus School of Health Policy and Management, Erasmus University, Rotterdam, Netherlands
  - Institute for Medical Technology Assessment, Erasmus University, Rotterdam, Netherlands
- William Redekop
  - Erasmus School of Health Policy and Management, Erasmus University, Rotterdam, Netherlands
  - Institute for Medical Technology Assessment, Erasmus University, Rotterdam, Netherlands
12
Ravaut M, Sadeghi H, Leung KK, Volkovs M, Kornas K, Harish V, Watson T, Lewis GF, Weisman A, Poutanen T, Rosella L. Predicting adverse outcomes due to diabetes complications with machine learning using administrative health data. NPJ Digit Med 2021; 4:24. [PMID: 33580109 PMCID: PMC7881135 DOI: 10.1038/s41746-021-00394-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 08/18/2020] [Accepted: 01/11/2021] [Indexed: 02/07/2023]
Abstract
Across jurisdictions, government and health insurance providers hold a large amount of data from patient interactions with the healthcare system. We aimed to develop a machine learning-based model for predicting adverse outcomes due to diabetes complications using administrative health data from the single-payer health system in Ontario, Canada. A Gradient Boosting Decision Tree model was trained on data from 1,029,366 patients, validated on 272,864 patients, and tested on 265,406 patients. Discrimination was assessed using the AUC statistic and calibration was assessed visually using calibration plots overall and across population subgroups. Our model predicting three-year risk of adverse outcomes due to diabetes complications (hyper/hypoglycemia, tissue infection, retinopathy, cardiovascular events, amputation) included 700 features from multiple diverse data sources and had strong discrimination (average test AUC = 77.7, range 77.7-77.9). Through the design and validation of a high-performance model to predict diabetes complications adverse outcomes at the population level, we demonstrate the potential of machine learning and administrative health data to inform health planning and healthcare resource allocation for diabetes management.
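The abstract above reports discrimination as an AUC. As a minimal sketch (not the authors' implementation; the function name `auc` and the toy labels and scores are illustrative assumptions), the AUC can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half:

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = adverse outcome occurred).
    scores: iterable of predicted risks, same length as labels.
    Returns a fraction in [0, 1]; multiply by 100 for a percent scale.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical, not taken from the study).
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(example_labels, example_scores))
```

The pairwise loop is quadratic in the number of cases, so for cohorts the size of those in the abstract a rank-based or library implementation would be the practical choice; the sketch is meant only to make the definition concrete.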
Affiliation(s)
- Mathieu Ravaut
  - Layer 6 AI, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Kathy Kornas
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Vinyas Harish
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
  - MD/PhD Program, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Tristan Watson
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
  - ICES, Toronto, ON, Canada
- Gary F Lewis
  - Department of Medicine, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
  - Department of Physiology, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Alanna Weisman
  - Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, ON, Canada
  - Division of Endocrinology and Metabolism, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Laura Rosella
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
  - ICES, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
  - Institute for Better Health, Trillium Health Partners, Mississauga, ON, Canada
  - Department of Laboratory Medicine & Pathology, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
13
Wang T, Paschalidis A, Liu Q, Liu Y, Yuan Y, Paschalidis IC. Predictive Models of Mortality for Hospitalized Patients With COVID-19: Retrospective Cohort Study. JMIR Med Inform 2020; 8:e21788. [PMID: 33055061 PMCID: PMC7572117 DOI: 10.2196/21788] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/25/2020] [Revised: 07/28/2020] [Accepted: 09/15/2020] [Indexed: 01/05/2023]
Abstract
BACKGROUND The novel coronavirus SARS-CoV-2 and its associated disease, COVID-19, have caused worldwide disruption, leading countries to take drastic measures to address the progression of the disease. As SARS-CoV-2 continues to spread, hospitals are struggling to allocate resources to patients who are most at risk. In this context, it has become important to develop models that can accurately predict the severity of infection of hospitalized patients to help guide triage, planning, and resource allocation. OBJECTIVE The aim of this study was to develop accurate models to predict the mortality of hospitalized patients with COVID-19 using basic demographics and easily obtainable laboratory data. METHODS We performed a retrospective study of 375 hospitalized patients with COVID-19 in Wuhan, China. The patients were randomly split into derivation and validation cohorts. Regularized logistic regression and support vector machine classifiers were trained on the derivation cohort, and accuracy metrics (F1 scores) were computed on the validation cohort. Two types of models were developed: the first type used laboratory findings from the entire length of the patient's hospital stay, and the second type used laboratory findings that were obtained no later than 12 hours after admission. The models were further validated on a multicenter external cohort of 542 patients. RESULTS Of the 375 patients with COVID-19, 174 (46.4%) died of the infection. The study cohort was composed of 224/375 men (59.7%) and 151/375 women (40.3%), with a mean age of 58.83 years (SD 16.46). The models developed using data from throughout the patients' length of stay demonstrated accuracies as high as 97%, whereas the models with admission laboratory variables possessed accuracies of up to 93%. The latter models predicted patient outcomes an average of 11.5 days in advance. 
Key variables such as lactate dehydrogenase, high-sensitivity C-reactive protein, and percentage of lymphocytes in the blood were indicated by the models. In line with previous studies, age was also found to be an important variable in predicting mortality. In particular, the mean age of patients who survived COVID-19 infection (50.23 years, SD 15.02) was significantly lower than the mean age of patients who died of the infection (68.75 years, SD 11.83; P<.001). CONCLUSIONS Machine learning models can be successfully employed to accurately predict outcomes of patients with COVID-19. Our models achieved high accuracies and could predict outcomes more than one week in advance; this promising result suggests that these models can be highly useful for resource allocation in hospitals.
Affiliation(s)
- Taiyao Wang
  - Department of Electrical and Computer Engineering, Boston University, Boston, MA, United States
  - Department of Biomedical Engineering, Boston University, Boston, MA, United States
  - Center for Information and Systems Engineering, Boston University, Boston, MA, United States
- Quanying Liu
  - Department of Biomedical Engineering, University of Science and Technology, Shenzhen, China
- Yingxia Liu
  - Third People's Hospital of Shenzhen, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen, China
- Ye Yuan
  - School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China
- Ioannis Ch Paschalidis
  - Department of Electrical and Computer Engineering, Boston University, Boston, MA, United States
  - Department of Biomedical Engineering, Boston University, Boston, MA, United States
  - Center for Information and Systems Engineering, Boston University, Boston, MA, United States
14
Wollenstein-Betech S, Cassandras CG, Paschalidis IC. Personalized predictive models for symptomatic COVID-19 patients using basic preconditions: Hospitalizations, mortality, and the need for an ICU or ventilator. Int J Med Inform 2020; 142:104258. [PMID: 32927229 PMCID: PMC7442577 DOI: 10.1016/j.ijmedinf.2020.104258] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 05/03/2020] [Revised: 07/26/2020] [Accepted: 08/17/2020] [Indexed: 01/08/2023]
Abstract
BACKGROUND The rapid global spread of the SARS-CoV-2 virus has provoked a spike in demand for hospital care. Hospital systems across the world have been over-extended, including in Northern Italy, Ecuador, and New York City, and many other systems face similar challenges. As a result, decisions on how to best allocate very limited medical resources and design targeted policies for vulnerable subgroups have come to the forefront. Specifically, under consideration are decisions on who to test, who to admit into hospitals, who to treat in an Intensive Care Unit (ICU), and who to support with a ventilator. Given today's ability to gather, share, analyze and process data, personalized predictive models based on demographics and information regarding prior conditions can be used to (1) help decision-makers allocate limited resources, when needed, (2) advise individuals how to better protect themselves given their risk profile, (3) differentiate social distancing guidelines based on risk, and (4) prioritize vaccinations once a vaccine becomes available. OBJECTIVE To develop personalized models that predict the following events: (1) hospitalization, (2) mortality, (3) need for ICU, and (4) need for a ventilator. To predict hospitalization, it is assumed that one has access to a patient's basic preconditions, which can be easily gathered without the need to be at a hospital and hence serve citizens and policy makers to assess individual risk during a pandemic. For the remaining models, different versions developed include different sets of a patient's features, with some including information on how the disease is progressing (e.g., diagnosis of pneumonia). MATERIALS AND METHODS National data from a publicly available repository, updated daily, containing information from approximately 91,000 patients in Mexico were used. 
The data for each patient include demographics, prior medical conditions, SARS-CoV-2 test results, hospitalization, mortality and whether a patient has developed pneumonia or not. Several classification methods were applied and compared, including robust versions of logistic regression and support vector machines, as well as random forests and gradient boosted decision trees. RESULTS Interpretable methods (logistic regression and support vector machines) perform just as well as more complex models in terms of accuracy and detection rates, with the additional benefit of elucidating variables on which the predictions are based. Classification accuracies reached 72%, 79%, 89%, and 90% for predicting hospitalization, mortality, need for ICU and need for a ventilator, respectively. The analysis reveals the most important preconditions for making the predictions. For the four models derived, these are: (1) for hospitalization: age, pregnancy, diabetes, gender, chronic renal insufficiency, and immunosuppression; (2) for mortality: age, immunosuppression, chronic renal insufficiency, obesity and diabetes; (3) for ICU need: development of pneumonia (if available), age, obesity, diabetes and hypertension; and (4) for ventilator need: ICU and pneumonia (if available), age, obesity, and hypertension.
Affiliation(s)
- Salomón Wollenstein-Betech
  - Department of Electrical & Computer Engineering, Division of Systems Engineering, Boston University, 8 Saint Mary's St., Boston, MA 02215, USA
- Christos G Cassandras
  - Department of Electrical & Computer Engineering, Division of Systems Engineering, Boston University, 8 Saint Mary's St., Boston, MA 02215, USA
- Ioannis Ch Paschalidis
  - Department of Electrical & Computer Engineering, Division of Systems Engineering, Boston University, 8 Saint Mary's St., Boston, MA 02215, USA
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA 02215, USA
15
Bertsimas D, Li ML, Paschalidis IC, Wang T. Prescriptive analytics for reducing 30-day hospital readmissions after general surgery. PLoS One 2020; 15:e0238118. [PMID: 32903282 PMCID: PMC7480861 DOI: 10.1371/journal.pone.0238118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2020] [Accepted: 08/09/2020] [Indexed: 11/18/2022]
Abstract
INTRODUCTION New financial incentives, such as reduced Medicare reimbursements, have led hospitals to closely monitor their readmission rates and initiate efforts aimed at reducing them. In this context, many surgical departments participate in the American College of Surgeons National Surgical Quality Improvement Program (NSQIP), which collects detailed demographic, laboratory, clinical, procedure and perioperative occurrence data. The availability of such data enables the development of data science methods which predict readmissions and, as done in this paper, offer specific recommendations aimed at preventing readmissions. MATERIALS AND METHODS This study leverages NSQIP data for 722,101 surgeries to develop predictive and prescriptive models, predicting readmissions and offering real-time, personalized treatment recommendations for surgical patients during their hospital stay, aimed at reducing the risk of a 30-day readmission. We applied a variety of classification methods to predict 30-day readmissions and developed two prescriptive methods to recommend pre-operative blood transfusions to increase the patient's hematocrit with the objective of preventing readmissions. The effect of these interventions was evaluated using several predictive models. RESULTS Predictions of 30-day readmissions based on the entire collection of NSQIP variables achieve an out-of-sample accuracy of 87% (Area Under the Curve-AUC). Predictions based only on pre-operative variables have an accuracy of 74% AUC, out-of-sample. Personalized interventions, in the form of pre-operative blood transfusions identified by the prescriptive methods, reduce readmissions by 12%, on average, for patients considered as candidates for pre-operative transfusion (pre-operative hematocrit <30). The prediction accuracy of the proposed models exceeds results in the literature.
CONCLUSIONS This study is among the first to develop a methodology for making specific, data-driven, personalized treatment recommendations to reduce the 30-day readmission rate. The reported predicted reduction in readmissions can lead to more than $20 million in savings in the U.S. annually.
Affiliation(s)
- Dimitris Bertsimas
  - Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Michael Lingzhi Li
  - Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Ioannis Ch. Paschalidis
  - Center for Information and Systems Engineering, Boston University, Boston, MA, United States of America
- Taiyao Wang
  - Center for Information and Systems Engineering, Boston University, Boston, MA, United States of America
16
Wollenstein-Betech S, Cassandras CG, Paschalidis IC. Personalized Predictive Models for Symptomatic COVID-19 Patients Using Basic Preconditions: Hospitalizations, Mortality, and the Need for an ICU or Ventilator. medRxiv 2020:2020.05.03.20089813. [PMID: 32511489 PMCID: PMC7273257 DOI: 10.1101/2020.05.03.20089813] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 01/08/2023]
Abstract
BACKGROUND The rapid global spread of the virus SARS-CoV-2 has provoked a spike in demand for hospital care. Hospital systems across the world have been over-extended, including in Northern Italy, Ecuador, and New York City, and many other systems face similar challenges. As a result, decisions on how to best allocate very limited medical resources have come to the forefront. Specifically, under consideration are decisions on who to test, who to admit into hospitals, who to treat in an Intensive Care Unit (ICU), and who to support with a ventilator. Given today's ability to gather, share, analyze and process data, personalized predictive models based on demographics and information regarding prior conditions can be used to (1) help decision-makers allocate limited resources, when needed, (2) advise individuals how to better protect themselves given their risk profile, (3) differentiate social distancing guidelines based on risk, and (4) prioritize vaccinations once a vaccine becomes available. OBJECTIVE To develop personalized models that predict the following events: (1) hospitalization, (2) mortality, (3) need for ICU, and (4) need for a ventilator. To predict hospitalization, it is assumed that one has access to a patient's basic preconditions, which can be easily gathered without the need to be at a hospital. For the remaining models, different versions developed include different sets of a patient's features, with some including information on how the disease is progressing (e.g., diagnosis of pneumonia). MATERIALS AND METHODS Data from a publicly available repository, updated daily, containing information from approximately 91,000 patients in Mexico were used. The data for each patient include demographics, prior medical conditions, SARS-CoV-2 test results, hospitalization, mortality and whether a patient has developed pneumonia or not. 
Several classification methods were applied, including robust versions of logistic regression, and support vector machines, as well as random forests and gradient boosted decision trees. RESULTS Interpretable methods (logistic regression and support vector machines) perform just as well as more complex models in terms of accuracy and detection rates, with the additional benefit of elucidating variables on which the predictions are based. Classification accuracies reached 61%, 76%, 83%, and 84% for predicting hospitalization, mortality, need for ICU and need for a ventilator, respectively. The analysis reveals the most important preconditions for making the predictions. For the four models derived, these are: (1) for hospitalization: age, gender, chronic renal insufficiency, diabetes, immunosuppression; (2) for mortality: age, SARS-CoV-2 test status, immunosuppression and pregnancy; (3) for ICU need: development of pneumonia (if available), cardiovascular disease, asthma, and SARS-CoV-2 test status; and (4) for ventilator need: ICU and pneumonia (if available), age, gender, cardiovascular disease, obesity, pregnancy, and SARS-CoV-2 test result.
Affiliation(s)
- Salomón Wollenstein-Betech
  - Division of Systems Engineering, Department of Electrical and Computer Engineering, Boston University, Boston, MA 02215
- Christos G Cassandras
  - Division of Systems Engineering, Department of Electrical and Computer Engineering, Boston University, Boston, MA 02215
- Ioannis Ch Paschalidis
  - Division of Systems Engineering, Department of Electrical and Computer Engineering, Department of Biomedical Engineering, Boston University, Boston, MA 02215
17
Spann A, Yasodhara A, Kang J, Watt K, Wang B, Goldenberg A, Bhat M. Applying Machine Learning in Liver Disease and Transplantation: A Comprehensive Review. Hepatology 2020; 71:1093-1105. [PMID: 31907954 DOI: 10.1002/hep.31103] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 07/26/2019] [Accepted: 12/05/2019] [Indexed: 12/13/2022]
Abstract
Machine learning (ML) utilizes artificial intelligence to generate predictive models efficiently and more effectively than conventional methods through detection of hidden patterns within large data sets. With this in mind, there are several areas within hepatology where these methods can be applied. In this review, we examine the literature pertaining to machine learning in hepatology and liver transplant medicine. We provide an overview of the strengths and limitations of ML tools and their potential applications to both clinical and molecular data in hepatology. ML has been applied to various types of data in liver disease research, including clinical, demographic, molecular, radiological, and pathological data. We anticipate that use of ML tools to generate predictive algorithms will change the face of clinical practice in hepatology and transplantation. This review will provide readers with the opportunity to learn about the ML tools available and potential applications to questions of interest in hepatology.
Affiliation(s)
- Ashley Spann
  - Division of Gastroenterology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Justin Kang
  - Multi Organ Transplant Program, Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
- Kymberly Watt
  - Division of Gastroenterology, Mayo Clinic, Rochester, MN
- Bo Wang
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Anna Goldenberg
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Mamatha Bhat
  - Multi Organ Transplant Program, Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
  - Division of Gastroenterology, Department of Medicine, University of Toronto, Toronto, ON, Canada