1
Patro KK, Allam JP, Sanapala U, Marpu CK, Samee NA, Alabdulhafith M, Plawiak P. An effective correlation-based data modeling framework for automatic diabetes prediction using machine and deep learning techniques. BMC Bioinformatics 2023; 24:372. [PMID: 37784049 PMCID: PMC10544445 DOI: 10.1186/s12859-023-05488-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/22/2023] [Accepted: 09/19/2023] [Indexed: 10/04/2023] Open
Abstract
The rising risk of diabetes, particularly in emerging countries, highlights the importance of early detection. Manual prediction can be a challenging task, leading to the need for automatic approaches. The major challenge with biomedical datasets is data scarcity. Biomedical data is often difficult to obtain in large quantities, which can limit the ability to train deep learning models effectively. Biomedical data can be noisy and inconsistent, which can make it difficult to train accurate models. To overcome the above-mentioned challenges, this work presents a new framework for data modeling that is based on correlation measures between features and can be used to process data effectively for predicting diabetes. The standard, publicly available Pima Indians Medical Diabetes (PIMA) dataset is utilized to verify the effectiveness of the proposed techniques. Experiments using the PIMA dataset showed that the proposed data modeling method improved the accuracy of machine learning models by an average of 9%, with deep convolutional neural network models achieving an accuracy of 96.13%. Overall, this study demonstrates the effectiveness of the proposed strategy in the early and reliable prediction of diabetes.
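The abstract does not spell out how the correlation measures are computed, so the following is only a minimal sketch of one plausible reading: rank features by the absolute Pearson correlation with the outcome. The function names, the toy column names, and the toy values are assumptions for illustration and are not taken from the paper or from the PIMA dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features_by_correlation(columns, target):
    """Return (name, |r|) pairs sorted from strongest to weakest correlation."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up values (hypothetical, not PIMA records).
toy_columns = {
    "glucose": [85, 90, 150, 160, 170, 95],
    "skin_thickness": [20, 35, 22, 33, 21, 34],
}
toy_outcome = [0, 0, 1, 1, 1, 0]
ranking = rank_features_by_correlation(toy_columns, toy_outcome)
```

In a sketch like this, weakly correlated features would be candidates for removal or re-weighting before model training; whether the authors proceed that way is not stated in the abstract.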
Affiliation(s)
- Kiran Kumar Patro: Department of ECE, Aditya Institute of Technology and Management, Tekkali, AP, 532201, India
- Jaya Prakash Allam: School of Computer Science and Engineering, VIT Vellore, Katpadi, Vellore, Tamil Nadu, 632014, India
- Chaitanya Kumar Marpu: Department of ECE, Aditya Institute of Technology and Management, Tekkali, AP, 532201, India
- Nagwan Abdel Samee: Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
- Maali Alabdulhafith: Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
- Pawel Plawiak: Department of Computer Science, Faculty of Computer Science and Telecommunications, Cracow University of Technology, Warszawska 24, 31-155, Krakow, Poland; Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Bałtycka 5, 44-100, Gliwice, Poland
2
Daskalaki E, Parkinson A, Brew-Sam N, Hossain MZ, O'Neal D, Nolan CJ, Suominen H. The Potential of Current Noninvasive Wearable Technology for the Monitoring of Physiological Signals in the Management of Type 1 Diabetes: Literature Survey. J Med Internet Res 2022; 24:e28901. [PMID: 35394448 PMCID: PMC9034434 DOI: 10.2196/28901] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/05/2021] [Revised: 12/06/2021] [Accepted: 12/23/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Monitoring glucose and other parameters in persons with type 1 diabetes (T1D) can enhance acute glycemic management and the diagnosis of long-term complications of the disease. For most persons living with T1D, the determination of insulin delivery is based on a single measured parameter: glucose. To date, wearable sensors exist that enable the seamless, noninvasive, and low-cost monitoring of multiple physiological parameters.
Objective: The objective of this literature survey is to explore whether some of the physiological parameters that can be monitored with noninvasive, wearable sensors may be used to enhance T1D management.
Methods: A list of physiological parameters, which can be monitored by using wearable sensors available in 2020, was compiled by a thorough review of the devices available in the market. A literature survey was performed using search terms related to T1D combined with the identified physiological parameters. The selected publications were restricted to human studies, which had at least their abstracts available. The PubMed and Scopus databases were interrogated. In total, 77 articles were retained and analyzed based on the following two axes: the reported relations between these parameters and T1D, which were found by comparing persons with T1D and healthy control participants, and the potential areas for T1D enhancement via the further analysis of the found relationships in studies working within T1D cohorts.
Results: On the basis of our search methodology, 626 articles were returned, and after applying our exclusion criteria, 77 (12.3%) articles were retained. Physiological parameters with potential for monitoring by using noninvasive wearable devices in persons with T1D included those related to cardiac autonomic function, cardiorespiratory control balance and fitness, sudomotor function, and skin temperature. Cardiac autonomic function measures, particularly the indices of heart rate and heart rate variability, have been shown to be valuable in diagnosing and monitoring cardiac autonomic neuropathy and, potentially, predicting and detecting hypoglycemia. All identified physiological parameters were shown to be associated with some aspects of diabetes complications, such as retinopathy, neuropathy, and nephropathy, as well as macrovascular disease, with capacity for early risk prediction. However, although they can be monitored by available wearable sensors, most studies have yet to adopt them, as opposed to using more conventional devices.
Conclusions: Wearable sensors have the potential to augment T1D sensing with additional, informative biomarkers, which can be monitored noninvasively, seamlessly, and continuously. However, significant challenges associated with measurement accuracy, removal of noise and motion artifacts, and smart decision-making exist. Consequently, research should focus on harvesting the information hidden in the complex data generated by wearable sensors and on developing models and smart decision strategies to optimize the incorporation of these novel inputs into T1D interventions.
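The survey refers to indices of heart rate and heart rate variability without listing formulas. As a hedged illustration only, the sketch below computes mean heart rate and two widely used time-domain variability indices (SDNN and RMSSD) from a list of RR intervals in milliseconds. The choice of these two indices, the function names, and the toy interval values are assumptions and are not drawn from the survey.

```python
from math import sqrt
from statistics import mean, stdev


def mean_heart_rate_bpm(rr_ms):
    """Mean heart rate in beats per minute from RR intervals in milliseconds."""
    return 60000.0 / mean(rr_ms)


def sdnn_ms(rr_ms):
    """Sample standard deviation of the RR intervals (SDNN), in milliseconds."""
    return stdev(rr_ms)


def rmssd_ms(rr_ms):
    """Root mean square of successive RR-interval differences (RMSSD), in ms."""
    squared_diffs = [(b - a) ** 2 for a, b in zip(rr_ms, rr_ms[1:])]
    return sqrt(mean(squared_diffs))


# Toy RR series (hypothetical values, not measured data).
toy_rr = [800, 810, 790, 805, 795]
toy_hr = mean_heart_rate_bpm(toy_rr)
toy_sdnn = sdnn_ms(toy_rr)
toy_rmssd = rmssd_ms(toy_rr)
```

Real wearable data would first need the noise and motion-artifact handling that the conclusions flag as an open challenge; this sketch assumes clean intervals.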
Affiliation(s)
- Elena Daskalaki: School of Computing, College of Engineering and Computer Science, The Australian National University, Canberra, Australia
- Anne Parkinson: Department of Health Services Research and Policy, Research School of Population Health, College of Health and Medicine, The Australian National University, Canberra, Australia
- Nicola Brew-Sam: Department of Health Services Research and Policy, Research School of Population Health, College of Health and Medicine, The Australian National University, Canberra, Australia
- Md Zakir Hossain: School of Computing, College of Engineering and Computer Science, The Australian National University, Canberra, Australia; School of Biology, College of Science, The Australian National University, Canberra, Australia; Bioprediction Activity, Commonwealth Industrial and Scientific Research Organisation, Canberra, Australia
- David O'Neal: Department of Medicine, University of Melbourne, Melbourne, Australia; Department of Endocrinology and Diabetes, St Vincent's Hospital Melbourne, Melbourne, Australia
- Christopher J Nolan: Australian National University Medical School and John Curtin School of Medical Research, College of Health and Medicine, The Australian National University, Canberra, Australia; Department of Diabetes and Endocrinology, The Canberra Hospital, Canberra, Australia
- Hanna Suominen: School of Computing, College of Engineering and Computer Science, The Australian National University, Canberra, Australia; Data61, Commonwealth Industrial and Scientific Research Organisation, Canberra, Australia; Department of Computing, University of Turku, Turku, Finland
3
A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus. Applied Sciences (Basel) 2021. [DOI: 10.3390/app11041742] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/15/2022]
Abstract
Type 1 diabetes mellitus (DM1) is a metabolic disease derived from falls in pancreatic insulin production resulting in chronic hyperglycemia. DM1 subjects usually have to undertake a number of assessments of blood glucose levels every day, employing capillary glucometers for the monitoring of blood glucose dynamics. In recent years, advances in technology have allowed for the creation of revolutionary biosensors and continuous glucose monitoring (CGM) techniques. This has enabled the monitoring of a subject’s blood glucose level in real time. On the other hand, few attempts have been made to apply machine learning techniques to predicting glycaemia levels, but dealing with a database containing such a high level of variables is problematic. In this sense, to the best of the authors’ knowledge, the issues of proper feature selection (FS)—the stage before applying predictive algorithms—have not been subject to in-depth discussion and comparison in past research when it comes to forecasting glycaemia. Therefore, in order to assess how a proper FS stage could improve the accuracy of the glycaemia forecasted, this work has developed six FS techniques alongside four predictive algorithms, applying them to a full dataset of biomedical features related to glycaemia. These were harvested through a wide-ranging passive monitoring process involving 25 patients with DM1 in practical real-life scenarios. From the obtained results, we affirm that Random Forest (RF) as both predictive algorithm and FS strategy offers the best average performance (Root Median Square Error, RMSE = 18.54 mg/dL) throughout the 12 considered predictive horizons (up to 60 min in steps of 5 min), showing Support Vector Machines (SVM) to have the best accuracy as a forecasting algorithm when considering, in turn, the average of the six FS techniques applied (RMSE = 20.58 mg/dL).
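To make the evaluation setup concrete, here is a minimal sketch of scoring forecasts over the 12 predictive horizons (5 to 60 min in 5 min steps). It assumes the conventional root-mean-square definition of RMSE, which may differ from the aggregation the abstract's wording denotes, and it uses a naive persistence forecast as a stand-in for the paper's RF and SVM models. The function names and the toy glucose values are hypothetical.

```python
from math import sqrt

# The 12 predictive horizons named in the abstract: 5, 10, ..., 60 minutes.
HORIZONS_MIN = list(range(5, 65, 5))


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences (mg/dL)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def persistence_rmse(series, steps_ahead):
    """RMSE of a naive forecast that repeats the last known reading.

    Assumes `series` is sampled every 5 minutes, so `steps_ahead` samples
    correspond to a horizon of 5 * steps_ahead minutes.
    """
    return rmse(series[steps_ahead:], series[:-steps_ahead])


def mean_over_horizons(rmse_by_horizon):
    """Average the per-horizon RMSE values into one summary score."""
    return sum(rmse_by_horizon.values()) / len(rmse_by_horizon)


# Toy glucose trace in mg/dL (made-up values, 5 min sampling).
toy_series = [100, 110, 120, 130]
toy_rmse_5min = persistence_rmse(toy_series, steps_ahead=1)
```

Averaging per-horizon scores, as `mean_over_horizons` does, mirrors the abstract's "average performance throughout the 12 considered predictive horizons" only in spirit; the exact averaging used by the authors is not given here.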
4
Investigating Health-Related Features and Their Impact on the Prediction of Diabetes Using Machine Learning. Applied Sciences (Basel) 2021. [DOI: 10.3390/app11031173] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
Diabetes Mellitus (DM) is one of the most common chronic diseases, leading to severe health complications that may cause death. The disease affects individuals, communities, and the government because of the continuous monitoring, lifelong commitment, and cost of treatment it entails. The World Health Organization (WHO) considers Saudi Arabia one of the top 10 countries in diabetes prevalence across the world. Since most of its medical services are provided by the government, the cost of treatment in terms of hospital and clinic visits and lab tests represents a real burden due to the large scale of the disease. The ability to predict the diabetic status of a patient with only a handful of features can allow cost-effective, rapid, and widely available screening of diabetes, thereby lessening the health and economic burden caused by diabetes alone. The goal of this paper is to investigate the prediction of diabetic patients and compare the role of HbA1c and FPG as input features. By using five different machine learning classifiers, and using feature elimination through feature permutation and hierarchical clustering, we established good performance for accuracy, precision, recall, and F1-score of the models on the dataset, implying that our data or features are not bound to specific models. In addition, the consistent performance across all the evaluation metrics indicates that there was no trade-off or penalty among the evaluation metrics. Further analysis was performed on the data to identify the risk factors and their indirect impact on diabetes classification. Our analysis showed strong agreement with the risk factors of diabetes and prediabetes stated by the American Diabetes Association (ADA) and other health institutions worldwide. We conclude that by performing analysis of the disease using selected features, important factors specific to the Saudi population can be identified, whose management can result in controlling the disease. We also provide some recommendations learned from this research.
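Feature permutation, mentioned in the abstract as part of feature elimination, can be sketched as follows: shuffle one feature's column, re-score the model, and record the average drop in accuracy. This is an illustrative sketch, not the authors' pipeline. The threshold rule standing in for a trained classifier, the feature layout, and the toy rows are all hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(rows, labels)) / len(labels)


def permutation_importance(model, rows, labels, feature_index, repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [row[feature_index] for row in rows]
        rng.shuffle(column)
        shuffled_rows = [
            row[:feature_index] + (value,) + row[feature_index + 1:]
            for row, value in zip(rows, column)
        ]
        drops.append(baseline - accuracy(model, shuffled_rows, labels))
    return sum(drops) / repeats


def toy_model(row):
    """Hypothetical stand-in classifier: flags a row when feature 0 is >= 6.5."""
    return int(row[0] >= 6.5)


# Toy rows of (HbA1c %, unrelated feature); made-up values.
toy_rows = [(5.2, 3), (5.6, 9), (6.0, 1), (7.1, 4), (8.3, 8), (9.0, 2)]
toy_labels = [0, 0, 0, 1, 1, 1]

importance_hba1c = permutation_importance(toy_model, toy_rows, toy_labels, 0)
importance_other = permutation_importance(toy_model, toy_rows, toy_labels, 1)
```

A feature the model ignores gets an importance of zero under this scheme, which is what makes it a candidate for elimination.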
5
Prediction of Metabolic Syndrome in a Mexican Population Applying Machine Learning Algorithms. Symmetry (Basel) 2020. [DOI: 10.3390/sym12040581] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 02/01/2023] Open
Abstract
Metabolic syndrome is a health condition that increases the risk of heart disease, diabetes, and stroke. The prognostic variables that identify this syndrome have already been defined by the World Health Organization (WHO), the National Cholesterol Education Program Third Adult Treatment Panel (ATP III), and the International Diabetes Federation. According to these guidelines, there is some symmetry among the anthropometric prognostic variables used to classify abdominal obesity in people with metabolic syndrome. However, some appear to be more sensitive than others, and the proposed definitions have failed to appropriately classify a specific population or ethnic group. In this work, we used the ATP III criteria as the framework to rank the health parameters (clinical and anthropometric measurements, lifestyle data, and blood tests) from a data set of 2942 participants in the Mexico City Tlalpan 2020 cohort, applying machine learning algorithms. We aimed to find the most appropriate prognostic variables to classify Mexicans with metabolic syndrome. The criteria of sensitivity, specificity, and balanced accuracy were used for validation. The ATP III criteria using the Waist-to-Height Ratio (WHtR) as an anthropometric index for the diagnosis of abdominal obesity achieved better classification performance than those using waist circumference or body mass index. Further work is needed to assess its precision as a classification tool for metabolic syndrome in a Mexican population.
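The validation criteria named in the abstract can be written out in a few lines. The sketch below computes the waist-to-height ratio and then sensitivity, specificity, and balanced accuracy from binary labels. The function names and the toy labels are assumptions for illustration; no cutoff value or result from the study is reproduced.

```python
def waist_to_height_ratio(waist_cm, height_cm):
    """WHtR: waist circumference divided by height, both in the same unit."""
    return waist_cm / height_cm


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
    return (sensitivity + specificity) / 2


# Toy labels (made-up): 3 positives, 5 negatives.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
toy_balanced = balanced_accuracy(toy_true, toy_pred)
```

Balanced accuracy is a natural choice when cases and non-cases are unevenly represented, since plain accuracy would be dominated by the larger class.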