1
Sheng B, Pushpanathan K, Guan Z, Lim QH, Lim ZW, Yew SME, Goh JHL, Bee YM, Sabanayagam C, Sevdalis N, Lim CC, Lim CT, Shaw J, Jia W, Ekinci EI, Simó R, Lim LL, Li H, Tham YC. Artificial intelligence for diabetes care: current and future prospects. Lancet Diabetes Endocrinol 2024; 12:569-595. [PMID: 39054035 DOI: 10.1016/s2213-8587(24)00154-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 03/28/2024] [Accepted: 05/16/2024] [Indexed: 07/27/2024]
Abstract
Artificial intelligence (AI) use in diabetes care is increasingly being explored to personalise care for people with diabetes and adapt treatments for complex presentations. However, the rapid advancement of AI also introduces challenges such as potential biases, ethical considerations, and implementation challenges in ensuring that its deployment is equitable. Ensuring inclusive and ethical developments of AI technology can empower both health-care providers and people with diabetes in managing the condition. In this Review, we explore and summarise the current and future prospects of AI across the diabetes care continuum, from enhancing screening and diagnosis to optimising treatment and predicting and managing complications.
Affiliation(s)
- Bin Sheng
- Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai, China; Key Laboratory of Artificial Intelligence, Ministry of Education, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Krithi Pushpanathan
- Centre of Innovation and Precision Eye Health, Department of Ophthalmology, National University of Singapore, Singapore; Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Zhouyu Guan
- Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai, China
- Quan Hziung Lim
- Department of Medicine, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
- Zhi Wei Lim
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Samantha Min Er Yew
- Centre of Innovation and Precision Eye Health, Department of Ophthalmology, National University of Singapore, Singapore; Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Yong Mong Bee
- Department of Endocrinology, Singapore General Hospital, Singapore; SingHealth Duke-National University of Singapore Diabetes Centre, Singapore Health Services, Singapore
- Charumathi Sabanayagam
- Ophthalmology and Visual Sciences Academic Clinical Program, Duke-National University of Singapore Medical School, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Nick Sevdalis
- Centre for Behavioural and Implementation Science Interventions, National University of Singapore, Singapore
- Chwee Teck Lim
- Department of Biomedical Engineering, National University of Singapore, Singapore; Institute for Health Innovation and Technology, National University of Singapore, Singapore; Mechanobiology Institute, National University of Singapore, Singapore
- Jonathan Shaw
- Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Weiping Jia
- Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai, China
- Elif Ilhan Ekinci
- Australian Centre for Accelerating Diabetes Innovations, Melbourne Medical School and Department of Medicine, University of Melbourne, Melbourne, VIC, Australia; Department of Endocrinology, Austin Health, Melbourne, VIC, Australia
- Rafael Simó
- Diabetes and Metabolism Research Unit, Vall d'Hebron University Hospital and Vall d'Hebron Research Institute, Barcelona, Spain; Centro de Investigación Biomédica en Red de Diabetes y Enfermedades Metabólicas Asociadas, Instituto de Salud Carlos III, Madrid, Spain
- Lee-Ling Lim
- Department of Medicine, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia; Department of Medicine and Therapeutics, Chinese University of Hong Kong, Hong Kong Special Administrative Region, China; Asia Diabetes Foundation, Hong Kong Special Administrative Region, China
- Huating Li
- Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai, China.
- Yih-Chung Tham
- Centre of Innovation and Precision Eye Health, Department of Ophthalmology, National University of Singapore, Singapore; Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Ophthalmology and Visual Sciences Academic Clinical Program, Duke-National University of Singapore Medical School, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore.
2
Danilov SD, Matveev GA, Babenko AY, Shlyakhto EV. Model for Predicting the Effect of Sibutramine Therapy in Obesity. J Pers Med 2024; 14:811. [PMID: 39202003 PMCID: PMC11355587 DOI: 10.3390/jpm14080811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 07/27/2024] [Accepted: 07/28/2024] [Indexed: 09/03/2024] Open
Abstract
Background: Models that predict response to weight-loss therapy with sibutramine have been developed in only a few cases. The objective of this work is to develop a data-driven method of personalized recommendation for obesity treatment that predicts the response to sibutramine from the patient's current set of parameters. Methods: The decision system is built on the XGBoost classification algorithm along with recursive feature selection and Shapley data valuation. Using the results of clinical trials, it was trained to estimate the probability of overcoming a weight loss threshold. The model was evaluated on accuracy using Leave-One-Out cross-validation. Results: The model for predicting response to sibutramine treatment over 3 months has an accuracy of 71%. The model for predicting outcomes at the sixth-month visit based on results at 3 months has an accuracy of 80%. Conclusions: Although our prediction model may not exhibit high precision compared to certain benchmarks, it significantly outperforms random chance or models relying only on BMI parameters. Our model used the available range of laboratory tests, which makes it possible to use the model in routine clinical practice and help doctors decide whether to prescribe sibutramine.
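The Leave-One-Out evaluation named in the Methods can be illustrated with a short sketch. This is not the authors' pipeline: a 1-nearest-neighbour rule stands in for XGBoost so the example needs only the standard library, and the two features (labelled here as BMI and HbA1c) and all values are invented toy data.

```python
from typing import Callable, List, Sequence


def nearest_neighbour_predict(train_x: List[Sequence[float]],
                              train_y: List[int],
                              query: Sequence[float]) -> int:
    """Stand-in classifier (assumption): label of the closest training point.

    The study used XGBoost; 1-NN is swapped in only to keep the sketch small.
    """
    closest = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[closest]


def leave_one_out_accuracy(x: List[Sequence[float]],
                           y: List[int],
                           predict: Callable = nearest_neighbour_predict) -> float:
    """Hold out each sample once, fit on the rest, and score the held-out prediction."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two samples with matching labels")
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return correct / len(x)


# Invented toy data: [BMI, HbA1c %] per patient, 1 = reached the weight-loss threshold.
features = [[30.0, 5.1], [31.0, 5.3], [29.5, 5.0], [38.0, 6.4], [39.0, 6.6], [37.5, 6.2]]
responded = [1, 1, 1, 0, 0, 0]
print(f"LOO accuracy: {leave_one_out_accuracy(features, responded):.2f}")
```

With n patients the procedure fits n models, which is affordable for small clinical cohorts and uses every record for testing exactly once.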
Affiliation(s)
- Georgiy A. Matveev
- Laboratory of Prediabetes and Metabolic Disorders, WCRC “Centre for Personalized Medicine”, Almazov National Medical Research Centre, Saint Petersburg 197341, Russia; (S.D.D.); (A.Y.B.)
3
Liu H, Dong S, Yang H, Wang L, Liu J, Du Y, Liu J, Lyu Z, Wang Y, Jiang L, Yu S, Fu X. Comparing the accuracy of four machine learning models in predicting type 2 diabetes onset within the Chinese population: a retrospective study. J Int Med Res 2024; 52:3000605241253786. [PMID: 38870271 PMCID: PMC11179491 DOI: 10.1177/03000605241253786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Accepted: 04/23/2024] [Indexed: 06/15/2024] Open
Abstract
OBJECTIVE To evaluate the effectiveness of machine learning (ML) models in predicting 5-year type 2 diabetes mellitus (T2DM) risk within the Chinese population by retrospectively analyzing annual health checkup records. METHODS We included 46,247 patients (32,372 and 13,875 in training and validation sets, respectively) from a national health checkup center database. Univariate and multivariate Cox analyses were performed to identify factors influencing T2DM risk. Extreme Gradient Boosting (XGBoost), support vector machine (SVM), logistic regression (LR), and random forest (RF) models were trained to predict 5-year T2DM risk. Model performances were analyzed using receiver operating characteristic (ROC) curves for discrimination and calibration plots for prediction accuracy. RESULTS Key variables included fasting plasma glucose, age, and sedentary time. The LR model showed good accuracy with respective areas under the ROC (AUCs) of 0.914 and 0.913 in training and validation sets; the RF model exhibited favorable AUCs of 0.998 and 0.838. In calibration analysis, the LR model displayed good fit for low-risk patients; the RF model exhibited satisfactory fit for low- and high-risk patients. CONCLUSIONS LR and RF models can effectively predict T2DM risk in the Chinese population. These models may help identify high-risk patients and guide interventions to prevent complications and disabilities.
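As a rough illustration of the discrimination measure used above, the sketch below computes an AUC from labels and scores as the share of positive-negative pairs ranked correctly, then compares the training and validation AUCs reported in the abstract. The function names are hypothetical. A large training-validation gap is commonly read as a sign of overfitting, although the abstract itself does not draw that conclusion.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), fine for a sketch.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def train_validation_gap(train_auc: float, validation_auc: float) -> float:
    """Drop in AUC from the training set to the validation set."""
    return train_auc - validation_auc


# (training AUC, validation AUC) as reported in the abstract.
reported = {"LR": (0.914, 0.913), "RF": (0.998, 0.838)}
for name, (train_auc, validation_auc) in reported.items():
    print(name, round(train_validation_gap(train_auc, validation_auc), 3))
```

On the reported figures the gap is 0.001 for logistic regression and 0.16 for random forest.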
Affiliation(s)
- Hongzhou Liu
- Department of Endocrinology, Aerospace Center Hospital, Beijing, China
- Department of Endocrinology, First Hospital of Handan City, Handan, China
- Song Dong
- Department of Endocrinology, Aerospace Center Hospital, Beijing, China
- Hua Yang
- Department of Outpatient, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Linlin Wang
- Department of Endocrinology, Aerospace Center Hospital, Beijing, China
- Jia Liu
- Department of Endocrinology, Aerospace Center Hospital, Beijing, China
- Yangfan Du
- Department of Endocrinology, Aerospace Center Hospital, Beijing, China
- Jing Liu
- Clinics of Cadre, Department of Outpatient, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Zhaohui Lyu
- Department of Endocrinology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Yuhan Wang
- Department of Endocrinology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Li Jiang
- Department of Endocrinology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Shasha Yu
- Department of Endocrinology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Xiaomin Fu
- Clinics of Cadre, Department of Outpatient, The First Medical Center, Chinese PLA General Hospital, Beijing, China
4
Bernstorff M, Hansen L, Enevoldsen K, Damgaard J, Hæstrup F, Perfalk E, Danielsen AA, Østergaard SD. Development and validation of a machine learning model for prediction of type 2 diabetes in patients with mental illness. Acta Psychiatr Scand 2024. [PMID: 38575118 DOI: 10.1111/acps.13687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 03/08/2024] [Accepted: 03/28/2024] [Indexed: 04/06/2024]
Abstract
BACKGROUND Type 2 diabetes (T2D) is approximately twice as common among individuals with mental illness compared with the background population, but may be prevented by early lifestyle, dietary, or pharmacological intervention. Such prevention relies on identification of those at elevated risk (prediction). The aim of this study was to develop and validate a machine learning model for prediction of T2D among patients with mental illness. METHODS The study was based on routine clinical data from electronic health records from the psychiatric services of the Central Denmark Region. A total of 74,880 patients with 1.59 million psychiatric service contacts were included in the analyses. We created 1343 potential predictors from 51 source variables, covering patient-level information on demographics, diagnoses, pharmacological treatment, and laboratory results. T2D was operationalised as HbA1c ≥48 mmol/mol, fasting plasma glucose ≥7.0 mmol/L, oral glucose tolerance test ≥11.1 mmol/L or random plasma glucose ≥11.1 mmol/L. Two machine learning models (XGBoost and regularised logistic regression) were trained to predict T2D based on 85% of the included contacts. The predictive performance of the best performing model was tested on the remaining 15% of the contacts. RESULTS The XGBoost model detected patients at high risk 2.7 years before T2D, achieving an area under the receiver operating characteristic curve of 0.84. Of the 996 patients developing T2D in the test set, the model issued at least one positive prediction for 305 (31%). CONCLUSION A machine learning model can accurately predict development of T2D among patients with mental illness based on routine clinical data from electronic health records. A decision support system based on such a model may inform measures to prevent development of T2D in this high-risk population.
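The outcome definition in the Methods is a simple any-of rule over laboratory values, which the sketch below encodes. The function and constant names are hypothetical, missing tests are represented as None, and the glucose thresholds are taken in mmol/L (the conventional unit for plasma glucose), which is an assumption about the intended units. The second helper reproduces the patient-level proportion reported in the Results.

```python
from typing import Optional

# Thresholds from the abstract; glucose units assumed to be mmol/L.
HBA1C_THRESHOLD_MMOL_PER_MOL = 48.0
FASTING_GLUCOSE_THRESHOLD_MMOL_PER_L = 7.0
OGTT_THRESHOLD_MMOL_PER_L = 11.1
RANDOM_GLUCOSE_THRESHOLD_MMOL_PER_L = 11.1


def meets_t2d_definition(hba1c: Optional[float] = None,
                         fasting_glucose: Optional[float] = None,
                         ogtt: Optional[float] = None,
                         random_glucose: Optional[float] = None) -> bool:
    """Return True if any available lab value reaches its threshold."""
    checks = [
        (hba1c, HBA1C_THRESHOLD_MMOL_PER_MOL),
        (fasting_glucose, FASTING_GLUCOSE_THRESHOLD_MMOL_PER_L),
        (ogtt, OGTT_THRESHOLD_MMOL_PER_L),
        (random_glucose, RANDOM_GLUCOSE_THRESHOLD_MMOL_PER_L),
    ]
    return any(value is not None and value >= threshold for value, threshold in checks)


def proportion_flagged(flagged_patients: int, patients_with_outcome: int) -> float:
    """Share of patients who developed the outcome and received at least one positive prediction."""
    return flagged_patients / patients_with_outcome


print(meets_t2d_definition(hba1c=50.0))              # HbA1c above threshold
print(meets_t2d_definition(fasting_glucose=6.2))     # below threshold, other tests missing
print(f"{proportion_flagged(305, 996):.0%}")         # figures from the Results
```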
Affiliation(s)
- Martin Bernstorff
- Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Lasse Hansen
- Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Kenneth Enevoldsen
- Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Jakob Damgaard
- Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Frida Hæstrup
- Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Erik Perfalk
- Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Andreas Aalkjær Danielsen
- Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Søren Dinesen Østergaard
- Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
5
Mohsen F, Al-Absi HRH, Yousri NA, El Hajj N, Shah Z. A scoping review of artificial intelligence-based methods for diabetes risk prediction. NPJ Digit Med 2023; 6:197. [PMID: 37880301 PMCID: PMC10600138 DOI: 10.1038/s41746-023-00933-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/25/2023] [Accepted: 09/25/2023] [Indexed: 10/27/2023] Open
Abstract
The increasing prevalence of type 2 diabetes mellitus (T2DM) and its associated health complications highlight the need to develop predictive models for early diagnosis and intervention. While many artificial intelligence (AI) models for T2DM risk prediction have emerged, a comprehensive review of their advancements and challenges is currently lacking. This scoping review maps out the existing literature on AI-based models for T2DM prediction, adhering to the PRISMA extension for Scoping Reviews guidelines. A systematic search of longitudinal studies was conducted across four databases, including PubMed, Scopus, IEEE-Xplore, and Google Scholar. Forty studies that met our inclusion criteria were reviewed. Classical machine learning (ML) models dominated these studies, with electronic health records (EHR) being the predominant data modality, followed by multi-omics, while medical imaging was the least utilized. Most studies employed unimodal AI models, with only ten adopting multimodal approaches. Both unimodal and multimodal models showed promising results, with the latter being superior. Almost all studies performed internal validation, but only five conducted external validation. Most studies utilized the area under the curve (AUC) for discrimination measures. Notably, only five studies provided insights into the calibration of their models. Half of the studies used interpretability methods to identify key risk predictors revealed by their models. Although a minority highlighted novel risk predictors, the majority reported commonly known ones. Our review provides valuable insights into the current state and limitations of AI-based models for T2DM prediction and highlights the challenges associated with their development and clinical integration.
Affiliation(s)
- Farida Mohsen
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Hamada R H Al-Absi
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Noha A Yousri
- Genetic Medicine, Weill Cornell Medicine-Qatar, Qatar Foundation, Doha, Qatar
- College of Health and Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Computer and Systems Engineering, Alexandria University, Alexandria, Egypt
- Nady El Hajj
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- College of Health and Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Zubair Shah
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar.
6
Chellappan D, Rajaguru H. Enhancement of Classifier Performance Using Swarm Intelligence in Detection of Diabetes from Pancreatic Microarray Gene Data. Biomimetics (Basel) 2023; 8:503. [PMID: 37887634 PMCID: PMC10604158 DOI: 10.3390/biomimetics8060503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 10/08/2023] [Accepted: 10/20/2023] [Indexed: 10/28/2023] Open
Abstract
In this study, we focused on using microarray gene data from pancreatic sources to detect diabetes mellitus. Dimensionality reduction (DR) techniques were used to reduce the high-dimensional microarray gene data: the Bessel function, Discrete Cosine Transform (DCT), Least Squares Linear Regression (LSLR), and Artificial Algae Algorithm (AAA). Subsequently, we applied two meta-heuristic algorithms, the Dragonfly Optimization Algorithm (DOA) and the Elephant Herding Optimization Algorithm (EHO), for feature selection. Classifiers such as Nonlinear Regression (NLR), Linear Regression (LR), Gaussian Mixture Model (GMM), Expectation Maximization (EM), Bayesian Linear Discriminant Classifier (BLDC), Logistic Regression (LoR), Softmax Discriminant Classifier (SDC), and Support Vector Machine (SVM) with three types of kernels, Linear, Polynomial, and Radial Basis Function (RBF), were utilized to detect diabetes. The classifiers' performance was analyzed based on parameters like accuracy, F1 score, MCC, error rate, FM metric, and Kappa. Without feature selection, the SVM (RBF) classifier achieved a high accuracy of 90% using the AAA DR method. The SVM (RBF) classifier using the AAA DR method with EHO feature selection outperformed the other classifiers with an accuracy of 95.714%. This improvement in accuracy emphasizes the role of feature selection methods.
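The performance parameters listed above can all be derived from a binary confusion matrix, as the following sketch shows. It assumes that "FM metric" refers to the Fowlkes-Mallows index and "Kappa" to Cohen's kappa; the abstract does not define either. The counts in the usage line are invented, not taken from the study.

```python
import math
from typing import Dict


def _safe_divide(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, error rate, F1, MCC, Fowlkes-Mallows and Cohen's kappa from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = _safe_divide(tp + tn, total)
    precision = _safe_divide(tp, tp + fp)
    recall = _safe_divide(tp, tp + fn)
    f1 = _safe_divide(2 * precision * recall, precision + recall)
    mcc = _safe_divide(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    fowlkes_mallows = math.sqrt(precision * recall)
    # Agreement expected by chance, from the row and column marginals.
    expected = _safe_divide((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), total ** 2)
    kappa = _safe_divide(accuracy - expected, 1 - expected)
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "f1": f1,
        "mcc": mcc,
        "fm": fowlkes_mallows,
        "kappa": kappa,
    }


# Invented counts for illustration only.
metrics = classification_metrics(tp=45, fp=5, fn=5, tn=45)
print({name: round(value, 3) for name, value in metrics.items()})
```

Reporting MCC and kappa alongside accuracy is useful because both correct for agreement that a classifier could reach by chance on imbalanced classes.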
Affiliation(s)
- Dinesh Chellappan
- Department of Electrical and Electronics Engineering, KPR Institute of Engineering and Technology, Coimbatore 641 407, Tamil Nadu, India
- Harikumar Rajaguru
- Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam 638 401, Tamil Nadu, India
7
Guan Z, Li H, Liu R, Cai C, Liu Y, Li J, Wang X, Huang S, Wu L, Liu D, Yu S, Wang Z, Shu J, Hou X, Yang X, Jia W, Sheng B. Artificial intelligence in diabetes management: Advancements, opportunities, and challenges. Cell Rep Med 2023; 4:101213. [PMID: 37788667 PMCID: PMC10591058 DOI: 10.1016/j.xcrm.2023.101213] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 02/16/2023] [Revised: 08/07/2023] [Accepted: 09/08/2023] [Indexed: 10/05/2023]
Abstract
The increasing prevalence of diabetes, high avoidable morbidity and mortality due to diabetes and diabetic complications, and related substantial economic burden make diabetes a significant health challenge worldwide. A shortage of diabetes specialists, uneven distribution of medical resources, low adherence to medications, and improper self-management contribute to poor glycemic control in patients with diabetes. Recent advancements in digital health technologies, especially artificial intelligence (AI), provide a significant opportunity to achieve better efficiency in diabetes care, which may diminish the increase in diabetes-related health-care expenditures. Here, we review the recent progress in the application of AI in the management of diabetes and then discuss the opportunities and challenges of AI application in clinical practice. Furthermore, we explore the possibility of combining and expanding upon existing digital health technologies to develop an AI-assisted digital health-care ecosystem that includes the prevention and management of diabetes.
Affiliation(s)
- Zhouyu Guan
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Huating Li
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Ruhan Liu
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; National Engineering Research Center of Personalized Diagnostic and Therapeutic Technology, Furong Laboratory, Changsha, Hunan 41000, China
- Chun Cai
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Yuexing Liu
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Jiajia Li
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Xiangning Wang
- Department of Ophthalmology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, China
- Shan Huang
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Liang Wu
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Dan Liu
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Shujie Yu
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Zheyuan Wang
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Jia Shu
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Xuhong Hou
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Xiaokang Yang
- MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Weiping Jia
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China.
- Bin Sheng
- Shanghai International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.
|
8
|
Naveed I, Kaleem MF, Keshavjee K, Guergachi A. Artificial intelligence with temporal features outperforms machine learning in predicting diabetes. PLOS Digital Health 2023; 2:e0000354. [PMID: 37878561 PMCID: PMC10599553 DOI: 10.1371/journal.pdig.0000354] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2023] [Accepted: 08/19/2023] [Indexed: 10/27/2023]
Abstract
Diabetes mellitus type 2 is increasingly being called a modern preventable pandemic, as even with excellent available treatments, the rate of complications of diabetes is rapidly increasing. Predicting diabetes and identifying it in its early stages could make it easier to prevent, allowing enough time to implement therapies before it gets out of control. Leveraging longitudinal electronic medical record (EMR) data with deep learning has great potential for diabetes prediction. This paper examines the predictive competency of deep learning models in contrast to state-of-the-art machine learning models to incorporate the time dimension of risk. The proposed research investigates a variety of deep learning models and features for predicting diabetes. Model performance was appraised and compared in relation to predominant features, risk factors, training data density and visit history. The framework was implemented on the longitudinal EMR records of over 19K patients extracted from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN). Empirical findings demonstrate that deep learning models consistently outperform other state-of-the-art competitors with prediction accuracy of above 91%, without overfitting. Fasting blood sugar, hemoglobin A1c and body mass index are the key predictors of future onset of diabetes. Overweight, middle aged patients and patients with hypertension are more vulnerable to developing diabetes, consistent with what is already known. Model performance improves as training data density or the visit history of a patient increases. This study confirms the ability of the LSTM deep learning model to incorporate the time dimension of risk in its predictive capabilities.
Affiliation(s)
- Iqra Naveed
- Department of Electrical Engineering, University of Management and Technology, Lahore, Pakistan
- Muhammad Farhat Kaleem
- Department of Electrical Engineering, University of Management and Technology, Lahore, Pakistan
- Karim Keshavjee
- Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada
- Aziz Guergachi
- Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada
- Ted Rogers School of Information Technology Management, Toronto Metropolitan University, Toronto, Canada
- Department of Mathematics and Statistics, York University, Toronto, Canada
|
9
|
Kononova Y, Abramyan L, Derevitskii I, Babenko A. Predictors of Carbohydrate Metabolism Disorders and Lethal Outcome in Patients after Myocardial Infarction: A Place of Glucose Level. J Pers Med 2023; 13:997. [PMID: 37373986 PMCID: PMC10305089 DOI: 10.3390/jpm13060997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Revised: 05/29/2023] [Accepted: 06/05/2023] [Indexed: 06/29/2023] Open
Abstract
BACKGROUND AND AIM: The aim of this study was to reveal statistical patterns in patients with acute myocardial infarction (AMI) that are associated with the development of carbohydrate metabolism disorders (CMDs) (type 2 diabetes mellitus and prediabetes) and death within 5 years after AMI. METHODS: 1079 patients who were treated for AMI in the Almazov National Medical Research Center were retrospectively selected for the study. For each patient, all data from electronic medical records were downloaded. Statistical patterns that determine the development of CMDs and death within 5 years after AMI were identified. To create and train the models used in this study, the classic methods of Data Mining, Data Exploratory Analysis, and Machine Learning were used. RESULTS: The main predictors of mortality within 5 years after AMI were advanced age, low relative level of lymphocytes, circumflex artery lesion, and glucose level. The main predictors of CMDs were low basophils, high neutrophils, high platelet distribution width, and high blood glucose level. High values of age and glucose together were relatively independent predictors. With glucose level >11 mmol/L and age >70 years, the 5-year risk of death is about 40%, and it rises with increasing glucose levels. CONCLUSION: The obtained results make it possible to predict the development of CMDs and death based on simple parameters that are easily available in clinical practice. Glucose level measured on the 1st day of AMI was among the most important predictors of CMDs and death.
Affiliation(s)
- Yulia Kononova
- World-Class Research Centre for Personalized Medicine, Almazov National Medical Research Centre, 197341 St. Petersburg, Russia
|
10
|
Choi BG, Park JY, Rha SW, Noh YK. Pre-test probability for coronary artery disease in patients with chest pain based on machine learning techniques. Int J Cardiol 2023:S0167-5273(23)00734-9. [PMID: 37230426 DOI: 10.1016/j.ijcard.2023.05.041] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/04/2023] [Revised: 05/15/2023] [Accepted: 05/21/2023] [Indexed: 05/27/2023]
Abstract
BACKGROUND: A correct and prompt diagnosis of coronary artery disease (CAD) is a crucial component of disease management to reduce the risk of death and improve the quality of life in patients with CAD. Currently, the American College of Cardiology (ACC)/American Heart Association (AHA) and the European Society of Cardiology (ESC) guidelines recommend selecting an appropriate pre-diagnosis test for an individual patient according to the CAD probability. The purpose of this study was to develop a practical pre-test probability (PTP) for obstructive CAD in patients with chest pain using machine learning (ML); in addition, the performance of ML-PTP for CAD was compared to the final result of coronary angiography (CAG). METHODS: We used a database from a single-center, prospective, all-comer registry designed to reflect real-world practice since 2004. All subjects underwent invasive CAG at Korea University Guro Hospital in Seoul, South Korea. We used logistic regression algorithms, random forest (RF), support vector machine, and K-nearest neighbor classification for the ML models. The dataset was divided into two consecutive sets according to the registration period to validate the ML models. ML training for PTP and internal validation used the first dataset registered between 2004 and 2012 (8631 patients). The second dataset registered between 2013 and 2014 (1546 patients) was used for external validation. The primary endpoint was obstructive CAD. Obstructive CAD was defined as having a stenosis diameter of >70% on the quantitative CAG of the main epicardial coronary artery. RESULTS: We derived an ML-based model consisting of three different models according to the subject used to obtain the information, such as the patient himself (dataset 1), the community's first medical center (dataset 2), and doctors (dataset 3). The performance range of the ML-PTP models as the non-invasive test had C-statistics of 0.795 to 0.984 compared to the result of invasive testing via CAG in patients with chest pain. The training ML-PTP models were adjusted to have 99% sensitivity for CAD so as not to miss actual CAD patients. In the testing dataset, the best accuracy of the ML-PTP model was 45.7% using dataset 1, 47.2% using dataset 2, and 92.8% using dataset 3 and the RF algorithm. The CAD prediction sensitivity was 99.0%, 99.0%, and 98.0%, respectively. CONCLUSION: We successfully developed a high-performance model of ML-PTP for CAD which is expected to reduce the need for non-invasive tests in chest pain. However, since this PTP model is derived from data of a single medical center, multicenter verification is required to use it as a PTP recommended by the major American societies and the ESC.
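The abstract reports models adjusted to 99% sensitivity, which is one reason accuracy can be low (for example 45.7%) while sensitivity stays high: lowering the decision threshold captures nearly all true cases at the cost of more false positives. The following is a minimal sketch of that kind of threshold selection, not the authors' procedure. The helper names `threshold_for_sensitivity` and `confusion`, and the toy scores and labels, are hypothetical and are not taken from the study.

```python
import math


def threshold_for_sensitivity(scores, labels, target=0.99):
    """Return the highest threshold whose sensitivity is at least `target`.

    A case is predicted positive when its score is >= the threshold.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive case")
    # Number of positives that must be captured to reach the target;
    # the small epsilon guards against floating-point overshoot in ceil().
    needed = max(1, math.ceil(target * len(positives) - 1e-12))
    return positives[needed - 1]


def confusion(scores, labels, threshold):
    """Count true/false positives and negatives at a given threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp, fp, tn, fn


# Toy model scores and true labels (hypothetical, for illustration only).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.35, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

t = threshold_for_sensitivity(scores, labels, target=0.99)
tp, fp, tn, fn = confusion(scores, labels, t)
sensitivity = tp / (tp + fn)
accuracy = (tp + tn) / len(labels)
print(t, sensitivity, accuracy)
```

In this toy case the threshold has to drop to the lowest-scoring positive, so every positive is caught but three negatives are flagged as well, which pulls accuracy down.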
Affiliation(s)
- Byoung Geol Choi
- Department of Computer Science, Hanyang University, Seoul 04763, Republic of Korea; Cardiovascular Center, Korea University Guro Hospital, Seoul 08308, Republic of Korea
- Ji Young Park
- Division of Cardiology, Nowon Eulji Medical Center, Eulji University, Seoul 01830, Republic of Korea
- Seung-Woon Rha
- Cardiovascular Center, Korea University Guro Hospital, Seoul 08308, Republic of Korea
- Yung-Kyun Noh
- Department of Computer Science, Hanyang University, Seoul 04763, Republic of Korea; School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
|
11
|
Datta S, Morassi Sasso A, Kiwit N, Bose S, Nadkarni G, Miotto R, Böttinger EP. Predicting hypertension onset from longitudinal electronic health records with deep learning. JAMIA Open 2022; 5:ooac097. [PMID: 36448021 PMCID: PMC9696747 DOI: 10.1093/jamiaopen/ooac097] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/13/2022] [Revised: 10/26/2022] [Accepted: 11/07/2022] [Indexed: 04/14/2024] Open
Abstract
Objective: Hypertension has long been recognized as one of the most important predisposing factors for cardiovascular diseases and mortality. In recent years, machine learning methods have shown potential in diagnostic and predictive approaches in chronic diseases. Electronic health records (EHRs) have emerged as a reliable source of longitudinal data. The aim of this study is to predict the onset of hypertension using modern deep learning (DL) architectures, specifically long short-term memory (LSTM) networks, and longitudinal EHRs. Materials and Methods: We compare this approach to the best performing models reported from previous works, particularly XGBoost, applied to aggregated features. Our work is based on data from 233 895 adult patients from a large health system in the United States. We divided our population into 2 distinct longitudinal datasets based on the diagnosis date. To ensure generalization to unseen data, we trained our models on the first dataset (dataset A "train and validation") using cross-validation, and then applied the models to a second dataset (dataset B "test") to assess their performance. We also experimented with 2 different time-windows before the onset of hypertension and evaluated the impact on model performance. Results: With the LSTM network, we were able to achieve an area under the receiver operating characteristic curve value of 0.98 in the "train and validation" dataset A and 0.94 in the "test" dataset B for a prediction time window of 1 year. Lipid disorders, type 2 diabetes, and renal disorders are found to be associated with incident hypertension. Conclusion: These findings show that DL models based on temporal EHR data can improve the identification of patients at high risk of hypertension and corresponding driving factors. In the long term, this work may support identifying individuals who are at high risk for developing hypertension and facilitate earlier intervention to prevent the future development of hypertension.
Affiliation(s)
- Suparno Datta
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Ariane Morassi Sasso
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Nina Kiwit
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Subhronil Bose
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Girish Nadkarni
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Riccardo Miotto
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Erwin P Böttinger
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
|
12
|
De D, Nayak T, Chowdhury S, Dhal PK. Insights of Host Physiological Parameters and Gut Microbiome of Indian Type 2 Diabetic Patients Visualized via Metagenomics and Machine Learning Approaches. Front Microbiol 2022; 13:914124. [PMID: 35923393 PMCID: PMC9340226 DOI: 10.3389/fmicb.2022.914124] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/06/2022] [Accepted: 06/13/2022] [Indexed: 11/13/2022] Open
Abstract
Type 2 diabetes (T2D) is a serious public health issue and may also contribute to modification in the structure of the intestinal microbiota, implying a link between T2D and microbial inhabitants in the digestive tract. This work aimed to develop efficient models for identifying essential physiological markers for improved T2D classification using machine learning algorithms. Using amplicon metagenomic approaches, an effort has also been made to understand the alterations in core gut microbial members in Indian T2D patients relative to controls with normal glucose tolerance (NGT). Our data indicate that the levels of fasting blood glucose (FBG) and glycated hemoglobin (HbA1c) were the most useful physiological indicators, while random forest and support vector machine with RBF kernel were effective prediction models for identification of T2D. The dominating gut microbial members Alloprevotella, Rikenellaceae RC9 gut group, Haemophilus, Ruminococcus torques group, etc. in Indian T2D patients showed a strong association with both FBG and HbA1c. These members have been reported to have a crucial role in gut barrier breakdown, blood glucose, and lipopolysaccharide level escalation, or as biomarkers. In contrast, the dominant NGT microbiota (Akkermansia, Ligilactobacillus, Enterobacter, etc.) in the colon has been shown to influence inflammatory immune responses by acting as an anti-inflammatory agent and maintaining the gut barrier. The topology study of co-occurrence network analysis indicates that changes in network complexity in T2D lead to variations in the different gut microbial members compared to NGT. These studies provide a better understanding of the gut microbial diversity in Indian T2D patients and show the way for the development of valuable diagnostic strategies to improve the prediction and modulation of T2D along with already established methods.
Affiliation(s)
- Debjit De
- Department of Life Science and Biotechnology, Jadavpur University, Kolkata, India
- Tilak Nayak
- Department of Life Science and Biotechnology, Jadavpur University, Kolkata, India
- Subhankar Chowdhury
- Department of Endocrinology, Institute of Post Graduate Medical Education and Research (IPGMER) and SSKM Hospital, Kolkata, India
- Paltu Kumar Dhal
- Department of Life Science and Biotechnology, Jadavpur University, Kolkata, India
- *Correspondence: Paltu Kumar Dhal
|
13
|
Padhy S, Dash S, Routray S, Ahmad S, Nazeer J, Alam A. IoT-Based Hybrid Ensemble Machine Learning Model for Efficient Diabetes Mellitus Prediction. Computational Intelligence and Neuroscience 2022; 2022:2389636. [PMID: 35634091 PMCID: PMC9132636 DOI: 10.1155/2022/2389636] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/01/2022] [Revised: 04/25/2022] [Accepted: 04/30/2022] [Indexed: 12/11/2022]
Abstract
Nowadays, there is a growing need for Internet of Things (IoT)-based mobile healthcare applications that help to predict diseases. In recent years, several people have been diagnosed with diabetes, and according to World Health Organization (WHO), diabetes affects 346 million individuals worldwide. Therefore, we propose a noninvasive self-care system based on the IoT and machine learning (ML) that analyses blood sugar and other key indicators to predict diabetes early. The main purpose of this work is to develop enhanced diabetes management applications which help in patient monitoring and technology-assisted decision-making. The proposed hybrid ensemble ML model predicts diabetes mellitus by combining both bagging and boosting methods. An online IoT-based application and offline questionnaire with 15 questions about health, family history, and lifestyle were used to recruit a total of 10221 people for the study. For both datasets, the experimental findings suggest that our proposed model outperforms state-of-the-art techniques.
Affiliation(s)
- Sasmita Padhy
- School of Computing Science and Engineering, VIT Bhopal University, Bhopal, Madhya Pradesh, India
- Sachikanta Dash
- Department of Computer Science and Engineering, GIET University, Gunupur, Odisha, India
- Sidheswar Routray
- Department of Computer Science and Engineering, School of Engineering, Indrashil University, Rajpur, Mehsana, Gujarat, India
- Sultan Ahmad
- Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Alkharj 11942, Saudi Arabia
- Jabeen Nazeer
- Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Alkharj 11942, Saudi Arabia
- Afroj Alam
- Department of Computer Science, Bakhtar University, Kabul, Afghanistan
|
14
|
Kodama S, Fujihara K, Horikawa C, Kitazawa M, Iwanaga M, Kato K, Watanabe K, Nakagawa Y, Matsuzaka T, Shimano H, Sone H. Predictive ability of current machine learning algorithms for type 2 diabetes mellitus: A meta-analysis. J Diabetes Investig 2022; 13:900-908. [PMID: 34942059 PMCID: PMC9077721 DOI: 10.1111/jdi.13736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Revised: 12/09/2021] [Accepted: 12/13/2021] [Indexed: 11/22/2022] Open
Abstract
AIMS/INTRODUCTION: Recently, an increasing number of cohort studies have suggested using machine learning (ML) to predict type 2 diabetes mellitus. However, its predictive ability remains inconclusive. This meta-analysis evaluated the current ability of ML algorithms for predicting incident type 2 diabetes mellitus. MATERIALS AND METHODS: We systematically searched longitudinal studies published from 1 January 1950 to 17 May 2020 using MEDLINE and EMBASE. Included studies had to compare ML's classification with the actual incidence of type 2 diabetes mellitus, and present data on the number of true positives, false positives, true negatives and false negatives. The dataset for these four values was pooled with a hierarchical summary receiver operating characteristic and a bivariate random effects model. RESULTS: There were 12 eligible studies. The pooled sensitivity, specificity, positive likelihood ratio and negative likelihood ratio were 0.81 (95% confidence interval [CI] 0.67-0.90), 0.82 (95% CI 0.74-0.88), 4.55 (95% CI 3.07-6.75) and 0.23 (95% CI 0.13-0.42), respectively. The area under the summarized receiver operating characteristic curve was 0.88 (95% CI 0.85-0.91). CONCLUSIONS: Current ML algorithms have sufficient ability to help clinicians determine whether individuals will develop type 2 diabetes mellitus in the future. However, persons should be cautious before changing their attitude toward future diabetes risk after learning the result of the diabetes prediction test using ML algorithms.
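The likelihood ratios in the abstract follow from sensitivity and specificity by the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below recomputes them from the rounded pooled values and converts a pre-test probability into a post-test probability through odds. The function names and the 10% pre-test probability are illustrative assumptions, not values from the meta-analysis. The paper's pooled ratios come from a bivariate model, so small differences from this direct calculation are expected.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Rounded pooled estimates reported in the abstract.
lr_pos, lr_neg = likelihood_ratios(0.81, 0.82)

# Hypothetical 10% pre-test probability of incident type 2 diabetes.
p_after_positive = post_test_probability(0.10, lr_pos)
p_after_negative = post_test_probability(0.10, lr_neg)

print(round(lr_pos, 2), round(lr_neg, 2))
print(round(p_after_positive, 3), round(p_after_negative, 3))
```

Under the assumed 10% pre-test probability, a positive result raises the estimate to about one in three, which is consistent with the abstract's caution about over-interpreting an individual prediction.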
Affiliation(s)
- Satoru Kodama
- Department of Prevention of Noncommunicable Diseases and Promotion of Health Checkup, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Department of Hematology, Endocrinology and Metabolism, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Kazuya Fujihara
- Department of Hematology, Endocrinology and Metabolism, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Chika Horikawa
- Department of Health and Nutrition, Faculty of Human Life Studies, University of Niigata Prefecture, Niigata, Japan
- Masaru Kitazawa
- Department of Prevention of Noncommunicable Diseases and Promotion of Health Checkup, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Midori Iwanaga
- Department of Prevention of Noncommunicable Diseases and Promotion of Health Checkup, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Department of Hematology, Endocrinology and Metabolism, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Kiminori Kato
- Department of Prevention of Noncommunicable Diseases and Promotion of Health Checkup, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Department of Hematology, Endocrinology and Metabolism, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Kenichi Watanabe
- Department of Prevention of Noncommunicable Diseases and Promotion of Health Checkup, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Department of Hematology, Endocrinology and Metabolism, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Yoshimi Nakagawa
- Division of Complex Biosystem Research, Institute of Natural Medicine, Toyama University, Toyama, Japan
- Takashi Matsuzaka
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Ibaraki, Japan
- Hitoshi Shimano
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Ibaraki, Japan
- Hirohito Sone
- Department of Hematology, Endocrinology and Metabolism, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
|
15
|
Wang D, Willis DR, Yih Y. The pneumonia severity index: Assessment and comparison to popular machine learning classifiers. Int J Med Inform 2022; 163:104778. [PMID: 35487075 DOI: 10.1016/j.ijmedinf.2022.104778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Revised: 04/18/2022] [Accepted: 04/20/2022] [Indexed: 10/18/2022]
Abstract
INTRODUCTION: Pneumonia is the top communicable cause of death worldwide. Accurate prognostication of patient severity with Community Acquired Pneumonia (CAP) allows better patient care and hospital management. The Pneumonia Severity Index (PSI) was developed in 1997 as a tool to guide clinical practice by stratifying the severity of patients with CAP. While the PSI has been evaluated against other clinical stratification tools, it has not been evaluated against multiple classic machine learning classifiers in various metrics over a large sample size. METHODS: In this paper, we evaluated and compared the prediction performance of nine classic machine learning classifiers with PSI over 34,720 adult (age 18+) patient records collected from 749 hospitals from 2009 to 2018 in the United States on Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) and Average Precision (Precision-Recall AUC). RESULTS: Machine learning classifiers, such as Random Forest, provided a highly statistically significant (p < 0.001) improvement (∼33% in PR AUC and ∼6% in ROC AUC) compared to PSI and required only 7 input values (compared to 20 parameters used in PSI). DISCUSSION: Because of its ease of use, PSI remains a very strong clinical decision tool, but machine learning classifiers can provide better prediction accuracy performance. Comparing prediction performance across multiple metrics such as PR AUC, instead of ROC AUC alone, can provide additional insight.
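The point that PR AUC can add insight beyond ROC AUC is easiest to see on imbalanced data, where a ranking can score well on ROC AUC while precision among the top-ranked cases is modest. The sketch below uses plain Python and textbook definitions: ROC AUC as the probability that a positive outranks a negative, and average precision as the mean precision at each positive's rank. The function names and toy data are hypothetical and unrelated to the study's records.

```python
def roc_auc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean precision at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy imbalanced example: 2 positives among 10 cases (hypothetical data).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [0, 1, 0, 1, 0, 0, 0, 0, 0, 0]

auc = roc_auc(scores, labels)
ap = average_precision(scores, labels)
print(auc, ap)
```

Here the many correctly ranked negatives lift ROC AUC, while average precision reflects that only half of the flagged cases at each positive's rank are true positives.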
Affiliation(s)
- Dawei Wang
- School of Industrial Engineering, Purdue University, 315 Grant St, West Lafayette, IN 47907, USA
- Deanna R Willis
- Indiana University School of Medicine, Department of Family Medicine, 1110 W. Michigan St, LO 200, Indianapolis, IN 46202, USA
- Yuehwern Yih
- School of Industrial Engineering, Purdue University, 315 Grant St, West Lafayette, IN 47907, USA
|
16
|
Tuppad A, Patil SD. Machine learning for diabetes clinical decision support: a review. Advances in Computational Intelligence 2022; 2:22. [PMID: 35434723 PMCID: PMC9006199 DOI: 10.1007/s43674-022-00034-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/17/2021] [Revised: 02/27/2022] [Accepted: 03/03/2022] [Indexed: 12/14/2022]
Abstract
Type 2 diabetes has recently acquired the status of an epidemic silent killer, though it is non-communicable. There are two main reasons behind this perception of the disease. First, a gradual but exponential growth in the disease prevalence has been witnessed irrespective of age groups, geography or gender. Second, the disease dynamics are very complex in terms of multifactorial risks involved, initial asymptomatic period, different short-term and long-term complications posing serious health threat and related co-morbidities. Majority of its risk factors are lifestyle habits like physical inactivity, lack of exercise, high body mass index (BMI), poor diet, smoking except some inevitable ones like family history of diabetes, ethnic predisposition, ageing etc. Nowadays, machine learning (ML) is increasingly being applied for alleviation of diabetes health burden and many research works have been proposed in the literature to offer clinical decision support in different application areas as well. In this paper, we present a review of such efforts for the prevention and management of type 2 diabetes. Firstly, we present the medical gaps in diabetes knowledge base, guidelines and medical practice identified from relevant articles and highlight those that can be addressed by ML. Further, we review the ML research works in three different application areas namely—(1) risk assessment (statistical risk scores and ML-based risk models), (2) diagnosis (using non-invasive and invasive features), (3) prognosis (from normoglycemia/prior morbidity to incident diabetes and prognosis of incident diabetes to related complications). We discuss and summarize the shortcomings or gaps in the existing ML methodologies for diabetes to be addressed in future. This review provides the breadth of ML predictive modeling applications for diabetes while highlighting the medical and technological gaps as well as various aspects involved in ML-based diabetes clinical decision support.
Affiliation(s)
- Ashwini Tuppad
- School of Computer Science and Engineering, REVA University, Rukmini Knowledge Park, Kattigenahalli, Bangalore, Karnataka, India
- Shantala Devi Patil
- School of Computer Science and Engineering, REVA University, Rukmini Knowledge Park, Kattigenahalli, Bangalore, Karnataka, India
|
17
|
Anti-Diabetic Effects of Ethanol Extract from Sanghuangporous vaninii in High-Fat/Sucrose Diet and Streptozotocin-Induced Diabetic Mice by Modulating Gut Microbiota. Foods 2022; 11:foods11070974. [PMID: 35407061 PMCID: PMC8997417 DOI: 10.3390/foods11070974] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/16/2022] [Revised: 03/21/2022] [Accepted: 03/24/2022] [Indexed: 01/27/2023] Open
Abstract
Type 2 diabetes mellitus (T2DM) may lead to abnormally elevated blood glucose, lipid metabolism disorder, and low-grade inflammation. Besides, the development of T2DM is always accompanied by gut microbiota dysbiosis and metabolic dysfunction. In this study, the T2DM mice model was established by feeding a high-fat/sucrose diet combined with injecting a low dose of streptozotocin. Additionally, the effects of oral administration of ethanol extract from Sanghuangporous vaninii (SVE) on T2DM and its complications (including hypoglycemia, hyperlipidemia, inflammation, and gut microbiota dysbiosis) were investigated. The results showed SVE could improve body weight, glycolipid metabolism, and inflammation-related parameters. Besides, SVE intervention effectively ameliorated the diabetes-induced pancreas and jejunum injury. Furthermore, SVE intervention significantly increased the relative abundances of Akkermansia, Dubosiella, Bacteroides, and Parabacteroides, and decreased the levels of Lactobacillus, Flavonifractor, Odoribacter, and Desulfovibrio compared to the model group (LDA > 3.0, p < 0.05). Metabolic function prediction of the intestinal microbiota by PICRUSt revealed that glycerolipid metabolism, insulin signaling pathway, PI3K-Akt signaling pathway, and fatty acid degradation were enriched in the diabetic mice treated with SVE. Moreover, the integrative analysis indicated that the key intestinal microbial phylotypes in response to SVE intervention were strongly correlated with glucose and lipid metabolism-associated biochemical parameters. These findings demonstrated that SVE has the potential to alleviate T2DM and its complications by modulating the gut microbiota imbalance.
|
18
|
Fregoso-Aparicio L, Noguez J, Montesinos L, García-García JA. Machine learning and deep learning predictive models for type 2 diabetes: a systematic review. Diabetol Metab Syndr 2021; 13:148. [PMID: 34930452 PMCID: PMC8686642 DOI: 10.1186/s13098-021-00767-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 07/06/2021] [Accepted: 12/07/2021] [Indexed: 12/12/2022] Open
Abstract
Diabetes Mellitus is a severe, chronic disease that occurs when blood glucose levels rise above certain limits. Over the last years, machine and deep learning techniques have been used to predict diabetes and its complications. However, researchers and developers still face two main challenges when building type 2 diabetes predictive models. First, there is considerable heterogeneity in previous studies regarding techniques used, making it challenging to identify the optimal one. Second, there is a lack of transparency about the features used in the models, which reduces their interpretability. This systematic review aimed at providing answers to the above challenges. The review followed the PRISMA methodology primarily, enriched with the one proposed by Keele and Durham Universities. Ninety studies were included, and the type of model, complementary techniques, dataset, and performance parameters reported were extracted. Eighteen different types of models were compared, with tree-based algorithms showing top performances. Deep Neural Networks proved suboptimal, despite their ability to deal with big and dirty data. Balancing data and feature selection techniques proved helpful to increase the model's efficiency. Models trained on tidy datasets achieved almost perfect models.
Affiliation(s)
- Luis Fregoso-Aparicio
- School of Engineering and Sciences, Tecnologico de Monterrey, Av Lago de Guadalupe KM 3.5, Margarita Maza de Juarez, 52926 Cd Lopez Mateos, Mexico
- Julieta Noguez
- School of Engineering and Sciences, Tecnologico de Monterrey, Ave. Eugenio Garza Sada 2501, 64849 Monterrey, Nuevo Leon, Mexico
- Luis Montesinos
- School of Engineering and Sciences, Tecnologico de Monterrey, Ave. Eugenio Garza Sada 2501, 64849 Monterrey, Nuevo Leon, Mexico
- José A. García-García
- Hospital General de Mexico Dr. Eduardo Liceaga, Dr. Balmis 148, Doctores, Cuauhtemoc, 06720 Mexico City, Mexico
|
19
|
Nomura A, Noguchi M, Kometani M, Furukawa K, Yoneda T. Artificial Intelligence in Current Diabetes Management and Prediction. Curr Diab Rep 2021; 21:61. [PMID: 34902070 PMCID: PMC8668843 DOI: 10.1007/s11892-021-01423-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Accepted: 07/13/2021] [Indexed: 10/28/2022]
Abstract
PURPOSE OF REVIEW Artificial intelligence (AI) can make advanced inferences based on a large amount of data. The mainstream technologies of the AI boom in 2021 are machine learning (ML) and deep learning, which have made significant progress due to the increase in computational resources accompanied by the dramatic improvement in computer performance. In this review, we introduce AI/ML-based medical devices and prediction models regarding diabetes. RECENT FINDINGS In the field of diabetes, several AI/ML-based medical devices for automatic retinal screening, clinical diagnosis support, and patient self-management have already been approved by the US Food and Drug Administration. As for new-onset diabetes prediction using ML methods, its performance has so far not been superior to that of conventional risk stratification models that use statistical approaches. Despite the current situation, it is expected that the predictive performance of AI will soon be maximized by a large amount of organized data and abundant computational resources, which will contribute to a dramatic improvement in the accuracy of disease prediction models for diabetes.
Affiliation(s)
- Akihiro Nomura
- Department of Biomedical Informatics, CureApp Institute, Karuizawa, Japan
- Innovative Clinical Research Center, Kanazawa University, 13-1 Takaramachi, Kanazawa, 9208641, Japan
- Department of Cardiovascular Medicine, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
- Department of Health Promotion and Medicine of the Future, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
- Masahiro Noguchi
- Department of Cardiovascular Medicine, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
- Mitsuhiro Kometani
- Department of Health Promotion and Medicine of the Future, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
- Kenji Furukawa
- Department of Health Promotion and Medicine of the Future, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
- Health Care Center, Japan Advanced Institute of Science and Technology, Nomi, Japan
- Takashi Yoneda
- Department of Health Promotion and Medicine of the Future, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
|
20
|
Helms TM, Köpnick A, Leber A, Zugck C, Steen H, Karle C, Remppis A, Zippel-Schultz B. [Heart failure care in a digitalized future: A discourse on resource-sparing structures and self-determined patients]. Internist (Berl) 2021; 62:1180-1190. [PMID: 34648044 DOI: 10.1007/s00108-021-01173-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/01/2021] [Indexed: 11/29/2022]
Abstract
Digital health solutions, applications of artificial intelligence (AI) and new technologies, such as cardiac magnetic resonance imaging and cardiac human genetics are currently being validated in cardiac healthcare pathways. They show promising approaches for improving existing healthcare structures in the future by strengthening the focus on predictive, preventive and personalized medicine. In addition, the accompanying use of digital health applications will become increasingly more important in the future healthcare, especially in patients with chronic diseases. In this article, the authors describe a case of chronic heart failure (HF) as an example to provide an overview of how digitalized healthcare can be efficiently designed across sectors and disciplines in the future. Moreover, the importance of a self-determined patient management for the treatment process itself is underlined. Since HF is frequently accompanied by various comorbidities during the course of the disease that are often recognized only after a delay, the necessity for a timely simultaneous and preventive treatment of multiple comorbidities in cardiovascular diseases is emphasized. Against this background the currently separately applied disease management programs (DMP) are critically questioned. The development of a holistic DMP encompassing all indications for the treatment of chronic diseases may pave the way to a more efficient medical care system.
Affiliation(s)
- Thomas M Helms
- Deutsche Stiftung für chronisch Kranke, Fürth, Germany; Peri Cor Arbeitsgruppe Kardiologie/Ass. UCSF, Hamburg, Germany
- Anne Köpnick
- Deutsche Stiftung für chronisch Kranke, Fürth, Germany
- Christian Zugck
- Kardiologie, Kardiologische Praxis im Steiner Thor, Straubing, Germany
- Andrew Remppis
- Herz- und Gefäßzentrum Bad Bevensen, Bad Bevensen, Germany
|
21
|
Lee S, Kim HS. Prospect of Artificial Intelligence Based on Electronic Medical Record. J Lipid Atheroscler 2021; 10:282-290. [PMID: 34621699 PMCID: PMC8473961 DOI: 10.12997/jla.2021.10.3.282] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 04/01/2021] [Revised: 06/04/2021] [Accepted: 07/05/2021] [Indexed: 11/23/2022]
Abstract
With the advent of the big data era, the interest of the international community is focusing on increasing the utilization of medical big data. Many hospitals are attempting to increase the efficiency of their operations and patient management by adopting artificial intelligence (AI) technology that enables the use of electronic medical record (EMR) data. EMR includes information about a patient's health history, such as diagnoses, medicines, tests, allergies, immunizations, treatment plans, personalized medical care, and improvement of medical quality and safety. EMR data can also be used for AI-based new drug development. In particular, it is effective to develop AI that can predict the occurrence of specific diseases or provide individualized customized treatments by classifying the individualized characteristics of patients. In order to improve performance of artificial intelligence research using EMR data, standardization and refinement of data are essential. In addition, since EMR data deal with sensitive personal information of patients, it is also vital to protect the patient's privacy. There are already various supports for the use of EMR data in the Korean government, and researchers are encouraged to be proactive.
Affiliation(s)
- Suehyun Lee
- Department of Biomedical Informatics, College of Medicine, Konyang University, Daejeon, Korea; Health Care Data Science Center, Konyang University Hospital, Daejeon, Korea
- Hun-Sung Kim
- Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Seoul, Korea; Division of Endocrinology and Metabolism, Department of Internal Medicine, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
|
22
|
Rhee SY, Sung JM, Kim S, Cho IJ, Lee SE, Chang HJ. Development and Validation of a Deep Learning Based Diabetes Prediction System Using a Nationwide Population-Based Cohort. Diabetes Metab J 2021; 45:515-525. [PMID: 33631067 PMCID: PMC8369223 DOI: 10.4093/dmj.2020.0081] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/27/2020] [Accepted: 08/19/2020] [Indexed: 12/20/2022]
Abstract
BACKGROUND Previously developed prediction models for type 2 diabetes mellitus (T2DM) have limited performance. We developed a deep learning (DL) based model using a cohort representative of the Korean population. METHODS This study was conducted on the basis of the National Health Insurance Service-Health Screening (NHIS-HEALS) cohort of Korea. Overall, 335,302 subjects without T2DM at baseline were included. We developed the model based on 80% of the subjects, and verified the power in the remainder. Predictive models for T2DM were constructed using the recurrent neural network long short-term memory (RNN-LSTM) network and the Cox longitudinal summary model. The performance of both models over a 10-year period was compared using a time dependent area under the curve. RESULTS During a mean follow-up of 10.4±1.7 years, the mean frequency of periodic health check-ups was 2.9±1.0 per subject. During the observation period, T2DM was newly observed in 8.7% of the subjects. The annual performance of the model created using the RNN-LSTM network was superior to that of the Cox model, and the risk factors for T2DM, derived using the two models were similar; however, certain results differed. CONCLUSION The DL-based T2DM prediction model, constructed using a cohort representative of the population, performs better than the conventional model. After pilot tests, this model will be provided to all Korean national health screening recipients in the future.
Affiliation(s)
- Sang Youl Rhee
- Department of Endocrinology and Metabolism, Kyung Hee University School of Medicine, Seoul, Korea
- Ji Min Sung
- Integrative Research Center for Cerebrovascular and Cardiovascular diseases, Yonsei University Health System, Yonsei University College of Medicine, Seoul, Korea
- Sunhee Kim
- Yonsei University College of Medicine, Yonsei University Health System, Seoul, Korea
- In-Jeong Cho
- Division of Cardiology, Ewha Womans University School of Medicine, Seoul, Korea
- Sang-Eun Lee
- Division of Cardiology, Severance Cardiovascular Hospital, Yonsei University Health System, Yonsei University College of Medicine, Seoul, Korea
- Hyuk-Jae Chang
- Division of Cardiology, Severance Cardiovascular Hospital, Yonsei University Health System, Yonsei University College of Medicine, Seoul, Korea
- Corresponding author: Hyuk-Jae Chang https://orcid.org/0000-0002-6139-7545 Division of Cardiology, Severance Cardiovascular Hospital, Yonsei University College of Medicine, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea E-mail:
|
23
|
Sajid MR, Muhammad N, Zakaria R, Shahbaz A, Bukhari SAC, Kadry S, Suresh A. Nonclinical Features in Predictive Modeling of Cardiovascular Diseases: A Machine Learning Approach. Interdiscip Sci 2021; 13:201-211. [PMID: 33675528 DOI: 10.1007/s12539-021-00423-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/17/2020] [Revised: 02/08/2021] [Accepted: 02/20/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND In the broader healthcare domain, prediction is more valuable than explanation, given the cost of delays in services. There are various risk prediction models for cardiovascular diseases (CVDs) in the literature for early risk assessment. However, the substantial increase in CVD-related mortality is challenging global health systems, especially in developing countries. This situation creates an opportunity for researchers to improve CVD prediction models using new features and risk computing methods. This study aims to assess nonclinical features that are readily available in any healthcare system for predicting CVDs using advanced and flexible machine learning (ML) algorithms. METHODS A gender-matched case-control study was conducted in the largest public sector cardiac hospital of Pakistan, and the data of 460 subjects were collected. The dataset comprised eight nonclinical features. Four supervised ML algorithms were used to train and test the models to predict CVD status, with traditional logistic regression (LR) as the baseline model. The models were validated through the train-test split (70:30) and tenfold cross-validation approaches. RESULTS Random forest (RF), a nonlinear ML algorithm, performed better than the other ML algorithms and LR. The area under the curve (AUC) of RF was 0.851 and 0.853 in the train-test split and tenfold cross-validation approaches, respectively. The nonclinical features yielded an admissible accuracy (minimum 71%) through the LR and ML models, demonstrating their predictive capability in risk estimation. CONCLUSION The satisfactory performance of nonclinical features reveals that these features and flexible computational methodologies can reinforce the existing risk prediction models for better healthcare services.
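The two validation schemes named in the abstract above, a 70:30 train-test split and tenfold cross-validation, can be sketched as index generators. This is an illustration only: the function names are hypothetical, the shuffling is plain random (the study's gender matching is not reproduced), and only the cohort size of 460 comes from the abstract.

```python
import random


def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle indices 0..n-1 and cut them into (train, test) lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each index is in exactly one test fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal slices
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


n_subjects = 460  # cohort size reported in the abstract
train, test = train_test_split_indices(n_subjects)
print(len(train), len(test))  # 322 138

fold_sizes = [len(test_fold) for _, test_fold in k_fold_indices(n_subjects)]
print(fold_sizes)  # ten folds of 46 subjects each
```

With 460 subjects, each cross-validation round therefore trains on 414 subjects and evaluates on 46, whereas the single split evaluates once on 138.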
Affiliation(s)
- Mirza Rizwan Sajid
- Centre for Mathematical Sciences, College of Computing and Applied Sciences, Universiti Malaysia Pahang, 26300, Gambang, Kuantan, Pahang Darul Makmur, Malaysia
- Noryanti Muhammad
- Centre for Mathematical Sciences, College of Computing and Applied Sciences, Universiti Malaysia Pahang, 26300, Gambang, Kuantan, Pahang Darul Makmur, Malaysia
- Roslinazairimah Zakaria
- Centre for Mathematical Sciences, College of Computing and Applied Sciences, Universiti Malaysia Pahang, 26300, Gambang, Kuantan, Pahang Darul Makmur, Malaysia
- Ahmad Shahbaz
- Punjab Institute of Cardiology, Lahore, 54000, Pakistan
- Syed Ahmad Chan Bukhari
- Division of Computer Science, Mathematics and Science, Collins College of Professional Studies, St. Johns University, New York, NY, 11439, USA
- Seifedine Kadry
- Faculty of Applied Computing and Technology, Noroff University College, Kristiansand, Norway
- A Suresh
- Department of Computer Science and Engineering, SRM Institute of Science & Technology, Kattankulathur, Chengalpattu (D.t), 603 203, Tamilnadu, India
|
24
|
Ravaut M, Harish V, Sadeghi H, Leung KK, Volkovs M, Kornas K, Watson T, Poutanen T, Rosella LC. Development and Validation of a Machine Learning Model Using Administrative Health Data to Predict Onset of Type 2 Diabetes. JAMA Netw Open 2021; 4:e2111315. [PMID: 34032855 PMCID: PMC8150694 DOI: 10.1001/jamanetworkopen.2021.11315] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 11/20/2020] [Accepted: 04/01/2021] [Indexed: 11/14/2022]
Abstract
Importance Systems-level barriers to diabetes care could be improved with population health planning tools that accurately discriminate between high- and low-risk groups to guide investments and targeted interventions. Objective To develop and validate a population-level machine learning model for predicting type 2 diabetes 5 years before diabetes onset using administrative health data. Design, Setting, and Participants This decision analytical model study used linked administrative health data from the diverse, single-payer health system in Ontario, Canada, between January 1, 2006, and December 31, 2016. A gradient boosting decision tree model was trained on data from 1 657 395 patients, validated on 243 442 patients, and tested on 236 506 patients. Costs associated with each patient were estimated using a validated costing algorithm. Data were analyzed from January 1, 2006, to December 31, 2016. Exposures A random sample of 2 137 343 residents of Ontario without type 2 diabetes was obtained at study start time. More than 300 features from data sets capturing demographic information, laboratory measurements, drug benefits, health care system interactions, social determinants of health, and ambulatory care and hospitalization records were compiled over 2-year patient medical histories to generate quarterly predictions. Main Outcomes and Measures Discrimination was assessed using the area under the receiver operating characteristic curve statistic, and calibration was assessed visually using calibration plots. Feature contribution was assessed with Shapley values. Costs were estimated in 2020 US dollars. Results This study trained a gradient boosting decision tree model on data from 1 657 395 patients (12 900 257 instances; 6 666 662 women [51.7%]). 
The developed model achieved a test area under the curve of 80.26 (range, 80.21-80.29), demonstrated good calibration, and was robust to sex, immigration status, area-level marginalization with regard to material deprivation and race/ethnicity, and low contact with the health care system. The top 5% of patients predicted as high risk by the model represented 26% of the total annual diabetes cost in Ontario. Conclusions and Relevance In this decision analytical model study, a machine learning model approach accurately predicted the incidence of diabetes in the population using routinely collected health administrative data. These results suggest that the model could be used to inform decision-making for population health planning and diabetes prevention.
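Discrimination in the abstract above is measured by the area under the receiver operating characteristic curve. A minimal sketch of that statistic follows, using its pairwise-ranking interpretation; the function name `roc_auc` and the toy scores are hypothetical, and the value is returned on a 0 to 1 scale, whereas the abstract's 80.26 appears to be expressed on a 0 to 100 scale.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy predicted risks (hypothetical): three people who developed diabetes
# (label 1) and three who did not (label 0).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))  # 8 of 9 pairs ranked correctly
```

The double loop is quadratic in the number of cases and is meant for illustration; at the scale of the cohort described above, a sort-based computation would be the practical choice.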
Affiliation(s)
- Mathieu Ravaut
- Layer 6 AI, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vinyas Harish
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Temerty Centre for Artificial Intelligence Research and Education in Medicine, University of Toronto, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
- Kathy Kornas
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Tristan Watson
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Institute of Clinical Evaluative Sciences (ICES), Toronto, Ontario, Canada
- Laura C. Rosella
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Temerty Centre for Artificial Intelligence Research and Education in Medicine, University of Toronto, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
- Institute of Clinical Evaluative Sciences (ICES), Toronto, Ontario, Canada
- Institute for Better Health, Trillium Health Partners, Mississauga, Ontario, Canada
|
25
|
Deberneh HM, Kim I. Prediction of Type 2 Diabetes Based on Machine Learning Algorithm. Int J Environ Res Public Health 2021; 18:3317. [PMID: 33806973 PMCID: PMC8004981 DOI: 10.3390/ijerph18063317] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 02/02/2021] [Revised: 03/15/2021] [Accepted: 03/17/2021] [Indexed: 12/17/2022]
Abstract
Prediction of type 2 diabetes (T2D) occurrence allows a person at risk to take actions that can prevent onset or delay the progression of the disease. In this study, we developed a machine learning (ML) model to predict T2D occurrence in the following year (Y + 1) using variables in the current year (Y). The dataset for this study was collected at a private medical institute as electronic health records from 2013 to 2018. To construct the prediction model, key features were first selected using ANOVA tests, chi-squared tests, and recursive feature elimination methods. The resultant features were fasting plasma glucose (FPG), HbA1c, triglycerides, BMI, gamma-GTP, age, uric acid, sex, smoking, drinking, physical activity, and family history. We then employed logistic regression, random forest, support vector machine, XGBoost, and ensemble machine learning algorithms based on these variables to predict the outcome as normal (non-diabetic), prediabetes, or diabetes. Based on the experimental results, the performance of the prediction model proved to be reasonably good at forecasting the occurrence of T2D in the Korean population. The model can provide clinicians and patients with valuable predictive information on the likelihood of developing T2D. The cross-validation (CV) results showed that the ensemble models had a superior performance to that of the single models. The CV performance of the prediction models was improved by incorporating more medical history from the dataset.
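The abstract above reports that ensemble models outperformed single models on a three-class outcome (normal, prediabetes, diabetes), but it does not say how the ensemble combines its members. As a sketch under the assumption of soft voting, that is, averaging each model's class probabilities and taking the most probable class, the combination step might look as follows; the function name `soft_vote` and the probability values are hypothetical.

```python
CLASSES = ("normal", "prediabetes", "diabetes")


def soft_vote(probabilities):
    """Average per-class probabilities across models.

    `probabilities` holds one probability vector per model, ordered as
    CLASSES. Returns (predicted class name, averaged probability vector).
    """
    n_models = len(probabilities)
    averaged = [
        sum(model[c] for model in probabilities) / n_models
        for c in range(len(CLASSES))
    ]
    best = max(range(len(CLASSES)), key=averaged.__getitem__)
    return CLASSES[best], averaged


# Hypothetical outputs of three base models for one person.
model_outputs = [
    [0.50, 0.30, 0.20],  # this model alone would predict "normal"
    [0.20, 0.50, 0.30],
    [0.25, 0.45, 0.30],
]
label, averaged = soft_vote(model_outputs)
print(label, [round(p, 3) for p in averaged])
```

In this toy case one model favours "normal", yet the averaged probabilities favour "prediabetes", which illustrates how an ensemble can override a single model's decision.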
Affiliation(s)
- Intaek Kim
- Department of Information and Communications Engineering, Myongji University, 116 Myongji-ro, Yongin, Gyeonggi 17058, Korea
|
26
|
Song J, Gao Y, Yin P, Li Y, Li Y, Zhang J, Su Q, Fu X, Pi H. The Random Forest Model Has the Best Accuracy Among the Four Pressure Ulcer Prediction Models Using Machine Learning Algorithms. Risk Manag Healthc Policy 2021; 14:1175-1187. [PMID: 33776495 PMCID: PMC7987326 DOI: 10.2147/rmhp.s297838] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 12/17/2020] [Accepted: 02/26/2021] [Indexed: 12/11/2022]
Abstract
Purpose To build machine learning models for predicting pressure ulcer nursing adverse events, and to find an optimal model that accurately predicts the occurrence of pressure ulcers. Patients and Methods We retrospectively enrolled 5814 patients, of whom 1673 had pressure ulcer events. Support vector machine (SVM), decision tree (DT), random forest (RF) and artificial neural network (ANN) models were used to construct the pressure ulcer prediction models. A total of 19 variables were included, and the importance of the screened variables was evaluated. The performance of the prediction models was then evaluated and compared. Results The experimental results show that the four pressure ulcer prediction models all achieve good performance, with AUC values all greater than 0.95. The comparison of the four models indicates that the RF model achieves the highest accuracy for the prediction of pressure ulcers. Conclusion This research verifies the feasibility of developing a management system for predicting nursing adverse events based on big data technology and machine learning technology. The random forest and decision tree models are more suitable for constructing a pressure ulcer prediction model. This study provides a reference for future pressure ulcer risk warning based on big data.
Affiliation(s)
- Jie Song
- Medical School of Chinese PLA, Beijing, People's Republic of China
- Yuan Gao
- First Medical Center, Chinese PLA General Hospital, Beijing, People's Republic of China
- Pengbin Yin
- Fourth Medical Center, Chinese PLA General Hospital, Beijing, People's Republic of China
- Yi Li
- Medical School of Chinese PLA, Beijing, People's Republic of China
- Yang Li
- First Medical Center, Chinese PLA General Hospital, Beijing, People's Republic of China
- Jie Zhang
- Sixth Medical Center, Chinese PLA General Hospital, Beijing, People's Republic of China
- Qingqing Su
- Medical School of Chinese PLA, Beijing, People's Republic of China
- Xiaojie Fu
- First Medical Center, Chinese PLA General Hospital, Beijing, People's Republic of China
- Hongying Pi
- Medical Service Training Center, Chinese PLA General Hospital, Beijing, People's Republic of China
|
27
|
A Non-invasive Approach to Identify Insulin Resistance with Triglycerides and HDL-c Ratio Using Machine learning. Neural Process Lett 2021. [DOI: 10.1007/s11063-021-10461-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/14/2022]
|
28
|
Zippel-Schultz B, Schultz C, Müller-Wieland D, Remppis AB, Stockburger M, Perings C, Helms TM. [Artificial intelligence in cardiology: Relevance, current applications, and future developments]. Herzschrittmacherther Elektrophysiol 2021; 32:89-98. [PMID: 33449234 DOI: 10.1007/s00399-020-00735-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/03/2020] [Accepted: 12/18/2020] [Indexed: 10/22/2022]
Abstract
Big data and applications of artificial intelligence (AI), such as machine learning or deep learning, will enrich healthcare in the future and become increasingly important. Among other things, they have the potential to avoid unnecessary examinations as well as diagnostic and therapeutic errors. They could enable improved, early and accelerated decision-making. In the article, the authors provide an overview of current AI-based applications in cardiology. The examples describe innovative solutions for risk assessment, diagnosis and therapy support up to patient self-management. Big data and AI serve as a basis for efficient, predictive, preventive and personalised medicine. However, the examples also show that research is needed to further develop the solutions for the benefit of the patient and the medical profession, to demonstrate the effectiveness and benefits in health care and to establish legal and ethical standards.
Affiliation(s)
- Carsten Schultz
- Lehrstuhl für Technologiemanagement, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Dirk Müller-Wieland
- Medizinische Klinik I - Kardiologie, Angiologie und Internistische Intensivmedizin, Uniklinik RWTH Aachen, Aachen, Germany
- Andrew B Remppis
- Klinik für Kardiologie, Herz- und Gefässzentrum Bad Bevensen, Bad Bevensen, Germany
- Martin Stockburger
- Medizinische Klinik Nauen, Schwerpunkt Kardiologie, Havelland Kliniken, Nauen, Germany
- Christian Perings
- Medizinische Klinik 1, St.-Marien-Hospital Lünen, Lünen, Germany
- Thomas M Helms
- Deutsche Stiftung für chronisch Kranke, Fürth, Germany; Peri Cor Arbeitsgruppe Kardiologie/Ass. UCSF, Hamburg, Germany
|
29
|
Stolfi P, Valentini I, Palumbo MC, Tieri P, Grignolio A, Castiglione F. Potential predictors of type-2 diabetes risk: machine learning, synthetic data and wearable health devices. BMC Bioinformatics 2020; 21:508. [PMID: 33308172 PMCID: PMC7733701 DOI: 10.1186/s12859-020-03763-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/15/2020] [Accepted: 09/17/2020] [Indexed: 12/19/2022]
Abstract
BACKGROUND The aim of a recent research project was the investigation of the mechanisms involved in the onset of type 2 diabetes in the absence of familiarity. This has led to the development of a computational model that recapitulates the aetiology of the disease and simulates the immunological and metabolic alterations linked to type-2 diabetes subjected to clinical, physiological, and behavioural features of prototypical human individuals. RESULTS We analysed the time course of 46,170 virtual subjects, experiencing different lifestyle conditions. We then set up a statistical model able to recapitulate the simulated outcomes. CONCLUSIONS The resulting machine learning model adequately predicts the synthetic dataset and can, therefore, be used as a computationally-cheaper version of the detailed mathematical model, ready to be implemented on mobile devices to allow self-assessment by informed and aware individuals. The computational model used to generate the dataset of this work is available as a web-service at the following address: http://kraken.iac.rm.cnr.it/T2DM .
Affiliation(s)
- Paola Stolfi
- Institute for Applied Mathematics, National Research Council of Italy, Rome, Italy
- Paolo Tieri
- Institute for Applied Mathematics, National Research Council of Italy, Rome, Italy
- Andrea Grignolio
- Research Ethics and Integrity Interdepartmental Center, National Research Council of Italy, Rome, Italy
- Medical Humanities - International MD Program, Vita-Salute San Raffaele University, Milan, Italy
- Filippo Castiglione
- Institute for Applied Mathematics, National Research Council of Italy, Rome, Italy
|
30
|
Basu S, Johnson KT, Berkowitz SA. Use of Machine Learning Approaches in Clinical Epidemiological Research of Diabetes. Curr Diab Rep 2020; 20:80. [PMID: 33270183 DOI: 10.1007/s11892-020-01353-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Accepted: 10/26/2020] [Indexed: 12/12/2022]
Abstract
PURPOSE OF REVIEW Machine learning approaches-which seek to predict outcomes or classify patient features by recognizing patterns in large datasets-are increasingly applied to clinical epidemiology research on diabetes. Given its novelty and emergence in fields outside of biomedical research, machine learning terminology, techniques, and research findings may be unfamiliar to diabetes researchers. Our aim was to present the use of machine learning approaches in an approachable way, drawing from clinical epidemiological research in diabetes published from 1 Jan 2017 to 1 June 2020. RECENT FINDINGS Machine learning approaches using tree-based learners-which produce decision trees to help guide clinical interventions-frequently have higher sensitivity and specificity than traditional regression models for risk prediction. Machine learning approaches using neural networking and "deep learning" can be applied to medical image data, particularly for the identification and staging of diabetic retinopathy and skin ulcers. Among the machine learning approaches reviewed, researchers identified new strategies to develop standard datasets for rigorous comparisons across older and newer approaches, methods to illustrate how a machine learner was treating underlying data, and approaches to improve the transparency of the machine learning process. Machine learning approaches have the potential to improve risk stratification and outcome prediction for clinical epidemiology applications. Achieving this potential would be facilitated by use of universal open-source datasets for fair comparisons. More work remains in the application of strategies to communicate how the machine learners are generating their predictions.
Collapse
Affiliation(s)
- Sanjay Basu
- Center for Primary Care, Harvard Medical School, Boston, MA, USA.
- Research and Population Health, Collective Health, San Francisco, CA, USA.
- School of Public Health, Imperial College London, London, SW7, UK.
- Karl T Johnson
- General Medicine and Clinical Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Seth A Berkowitz
- General Medicine and Clinical Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
31
|
Raja JB, Pandian SC. PSO-FCM based data mining model to predict diabetic disease. Comput Methods Programs Biomed 2020; 196:105659. [PMID: 32698060 DOI: 10.1016/j.cmpb.2020.105659] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/13/2020] [Accepted: 07/07/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND AND OBJECTIVE Diabetes is typically characterized by higher than normal blood sugar levels, with insulin production regarded as insufficient. The percentage of people affected by diabetes has recently grown substantially throughout the world, and the problem must be taken more seriously to ensure that this percentage is reduced. Recently, several research teams have conducted detailed research on data mining approaches and compared their precision. Data mining, through parametric modeling of health data such as diabetic patient data sets, can be used to synthesize expertise in the field. METHODS In this study, a new model is proposed for forecasting type 2 diabetes mellitus (T2DM) based on data mining strategies. Combined Particle Swarm Optimization (PSO) and Fuzzy Clustering Means (FCM) (PSO-FCM) is used to evaluate a set of medical data relating to a diabetes diagnosis challenge. RESULTS Experiments are performed on the Pima Indians Diabetes Database. The sensitivity, specificity and accuracy metrics widely used in medical studies are used to assess the effectiveness of the proposed system. The prototype achieved 8.26 percent higher accuracy than the other methods. CONCLUSION Compared with other models, the proposed PSO-FCM method delivers better performance.
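For readers unfamiliar with fuzzy clustering, the following is a minimal sketch of one iteration of plain fuzzy c-means on one-dimensional values. It is not the authors' PSO-FCM method: the particle swarm step is omitted, and the function names, the fuzzifier `m = 2`, and the toy values are all assumptions made for illustration.

```python
def fcm_memberships(points, centers, m=2.0):
    """Membership of each point in each cluster (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # Point sits exactly on a center: split full membership there.
            zeros = dists.count(0.0)
            row = [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                   for d_j in dists]
        memberships.append(row)
    return memberships


def update_centers(points, memberships, m=2.0):
    """Recompute each center as a membership-weighted mean of the points."""
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


# Hypothetical one-dimensional measurements and two starting centers.
points = [1.0, 3.0]
u = fcm_memberships(points, centers=[0.0, 4.0])
new_centers = update_centers(points, u)
```

Unlike hard clustering, each point keeps a graded membership in every cluster. In a PSO-based variant, a swarm search would typically be used to choose the cluster centers instead of relying only on this alternating update.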
Affiliation(s)
- J Beschi Raja
- Assistant Professor, Department of Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India.
|
32
|
Dworzynski P, Aasbrenn M, Rostgaard K, Melbye M, Gerds TA, Hjalgrim H, Pers TH. Nationwide prediction of type 2 diabetes comorbidities. Sci Rep 2020; 10:1776. [PMID: 32019971 PMCID: PMC7000818 DOI: 10.1038/s41598-020-58601-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/19/2019] [Accepted: 01/16/2020] [Indexed: 02/06/2023]
Abstract
Identification of individuals at risk of developing disease comorbidities represents an important task in tackling the growing personal and societal burdens associated with chronic diseases. We employed machine learning techniques to investigate to what extent data from longitudinal, nationwide Danish health registers can be used to predict individuals at high risk of developing type 2 diabetes (T2D) comorbidities. Leveraging logistic regression-, random forest- and gradient boosting models and register data spanning hospitalizations, drug prescriptions and contacts with primary care contractors from >200,000 individuals newly diagnosed with T2D, we predicted five-year risk of heart failure (HF), myocardial infarction (MI), stroke (ST), cardiovascular disease (CVD) and chronic kidney disease (CKD). For HF, MI, CVD, and CKD, register-based models outperformed a reference model leveraging canonical individual characteristics by achieving area under the receiver operating characteristic curve improvements of 0.06, 0.03, 0.04, and 0.07, respectively. The top 1,000 patients predicted to be at highest risk exhibited observed incidence ratios exceeding 4.99, 3.52, 1.97 and 4.71 respectively. In summary, prediction of T2D comorbidities utilizing Danish registers led to consistent albeit modest performance improvements over reference models, suggesting that register data could be leveraged to systematically identify individuals at risk of developing disease comorbidities.
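The comparison above is reported as differences in area under the receiver operating characteristic curve (AUC). A minimal sketch of that metric, computed as the probability that a randomly chosen case is scored higher than a randomly chosen non-case, is given below. The scores are invented toy values, and `reference_scores` and `register_scores` are hypothetical stand-ins for the two kinds of model, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]                     # 1 = developed the comorbidity
reference_scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical reference model
register_scores = [0.1, 0.3, 0.6, 0.8]    # hypothetical register-based model

auc_reference = roc_auc(labels, reference_scores)
auc_register = roc_auc(labels, register_scores)
improvement = auc_register - auc_reference
```

The pairwise loop is quadratic in the number of individuals, so a rank-based implementation would be preferable at register scale; the definition is the same.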
Affiliation(s)
- Piotr Dworzynski
- The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Martin Aasbrenn
- The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Department of Geriatrics and Internal Medicine, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark
- Klaus Rostgaard
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Mads Melbye
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Henrik Hjalgrim
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Department of Haematology, Rigshospitalet, Copenhagen, Denmark
- Tune H Pers
- The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark.
|
33
|
Tigga NP, Garg S. Prediction of Type 2 Diabetes using Machine Learning Classification Methods. Procedia Comput Sci 2020. [DOI: 10.1016/j.procs.2020.03.336] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Indexed: 12/13/2022]
|
34
|
|
35
|
Cadet XF, Lo-Thong O, Bureau S, Dehak R, Bessafi M. Use of Machine Learning and Infrared Spectra for Rheological Characterization and Application to the Apricot. Sci Rep 2019; 9:19197. [PMID: 31844151 PMCID: PMC6915699 DOI: 10.1038/s41598-019-55543-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/31/2019] [Accepted: 11/29/2019] [Indexed: 12/04/2022]
Abstract
Fast advancement of machine learning methods and constant growth of the areas of application open up new horizons for large data management and processing. Among the various types of data available for analysis, Fourier Transform InfraRed (FTIR) spectroscopy spectra are very challenging datasets to consider. In this study, machine learning is used to analyze and predict a rheological parameter: firmness. Various statistics have been gathered, including both chemistry (such as ethylene, titratable acidity or sugars) and spectra values, to visualize and analyze a dataset of 731 biological samples. Two-dimensional (2D) and three-dimensional (3D) principal component analyses (PCA) are used to evaluate their ability to discriminate for one parameter: firmness. Partial least squares regression (PLSR) modeling has been carried out to predict the rheological parameter using either sixteen physicochemical parameters or only the infrared spectra. We show that (i) the spectra alone allow good discrimination of the samples based on rheology, (ii) 3D-PCA allows comprehensive and informative visualization of the data, and (iii) the rheological parameters are predicted accurately using a regression method such as PLSR; instead of using chemical parameters which are laborious to obtain, Mid-FTIR spectra gathering all physicochemical information could be used for efficient prediction of firmness. In conclusion, rheological and chemical parameters allow good discrimination of the samples according to their firmness. However, using only the IR spectra leads to better results. A good predictive model was built for the prediction of the firmness of the fruit, and we reached a coefficient of determination (R²) of 0.90. This method outperforms a model based on physicochemical descriptors only. Such an approach could be very helpful to technologists and farmers.
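The abstract reports model quality as a coefficient of determination. A minimal sketch of that calculation follows, using the usual definition of one minus the ratio of residual to total sum of squares. The firmness values are invented for illustration, and the sketch covers only the metric, not the PLSR model itself.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)               # total variation
    ss_res = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))    # unexplained part
    if ss_tot == 0:
        raise ValueError("observed values have no variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted firmness values (arbitrary units).
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2 = r_squared(measured, predicted)
```

A value of 1 means the predictions reproduce the measurements exactly, while 0 means they do no better than always predicting the mean.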
Affiliation(s)
- Xavier F Cadet
- PEACCEL, Protein Engineering Accelerator, 6 square Albin Cachot, box 42, 75013, Paris, France.
- LSE laboratory, EPITA, Paris, 94276, France.
- Ophélie Lo-Thong
- University of Paris, UMR_S1134, BIGR, Inserm, F-75015, Paris, France.
- DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, F-97715, Saint-Denis, France
- Sylvie Bureau
- UMR408 SQPOV, Sécurité et Qualité des Produits d'Origine Végétale, INRA, Avignon University, F-84000, Avignon, France
- Reda Dehak
- LSE laboratory, EPITA, Paris, 94276, France
- Miloud Bessafi
- LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, 97444, St Denis Cedex, France
|