1
Mroz T, Griffin M, Cartabuke R, Laffin L, Russo-Alvarez G, Thomas G, Smedira N, Meese T, Shost M, Habboub G. Predicting hypertension control using machine learning. PLoS One 2024; 19:e0299932. [PMID: 38507433 PMCID: PMC10954144 DOI: 10.1371/journal.pone.0299932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 02/17/2024] [Indexed: 03/22/2024] Open
Abstract
Hypertension is a widely prevalent disease, and uncontrolled hypertension predisposes affected individuals to severe adverse effects. Though the importance of controlling hypertension is clear, the multitude of therapeutic regimens and patient factors that affect the success of blood pressure control makes it difficult to predict whether a patient's blood pressure will be controlled. This project investigates whether machine learning can accurately predict the control of a patient's hypertension within 12 months of a clinical encounter. To build the machine learning model, a retrospective review of the electronic medical records of 350,008 patients 18 years of age and older between January 1, 2015 and June 1, 2022 was performed to form model training and testing cohorts. The model inputs included medication combinations, patient laboratory values, vital sign measurements, comorbidities, healthcare encounters, and demographic information. The mean age of the patient population was 65.6 years, with 161,283 (46.1%) men and 275,001 (78.6%) white. A sliding time window of data was used both to prohibit data leakage from training sets to test sets and to maximize model performance. This sliding window resulted in using the study data to create 287 predictive models, each using 2 years of training data and one week of testing data, for a total study duration of five and a half years. Model performance was combined across all models. The primary outcome, prediction of blood pressure control within 12 months, demonstrated an area under the curve of 0.76 (95% confidence interval: 0.75-0.76), sensitivity of 61.52% (61.0-62.03%), specificity of 75.69% (75.25-76.13%), positive predictive value of 67.75% (67.51-67.99%), and negative predictive value of 70.49% (70.32-70.66%). An AUC of 0.756 is considered to be moderately good for machine learning models.
While the accuracy of this model is promising, it is impossible to state with certainty the clinical relevancy of any clinical support ML model without deploying it in a clinical setting and studying its impact on health outcomes. By also incorporating uncertainty analysis for every prediction, the authors believe that this approach offers the best-known solution to predicting hypertension control and that machine learning may be able to improve the accuracy of hypertension control predictions using patient information already available in the electronic health record. This method can serve as a foundation with further research to strengthen the model accuracy and to help determine clinical relevance.
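The sliding-window design described in the abstract can be illustrated with a minimal sketch. The function name `sliding_windows`, the 104-week training length, and the one-week step are assumptions made for illustration; the abstract does not give the exact boundary rules, so the window count produced here need not equal the reported 287.

```python
from datetime import date, timedelta


def sliding_windows(start, end, train_days, test_days, step_days):
    """Yield (train_start, train_end, test_end) date triples.

    Training covers [train_start, train_end) and testing covers
    [train_end, test_end), so each test week lies strictly after
    the training window of the model evaluated on it.
    """
    train_start = start
    while True:
        train_end = train_start + timedelta(days=train_days)
        test_end = train_end + timedelta(days=test_days)
        if test_end > end:
            break
        yield train_start, train_end, test_end
        train_start += timedelta(days=step_days)


# Assumed settings: 104 weeks of training, 1 week of testing, 1-week step.
windows = list(
    sliding_windows(date(2015, 1, 1), date(2022, 6, 1),
                    train_days=104 * 7, test_days=7, step_days=7)
)
first = windows[0]
print(len(windows), "windows; first test week starts", first[1])
```

Each triple would drive one train-and-evaluate cycle, and the per-window test predictions could then be pooled before computing AUC, sensitivity, and specificity.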
Affiliation(s)
- Thomas Mroz
  - Orthopaedics and Rheumatology Institute, Cleveland Clinic, Cleveland, OH, United States of America
  - Center for Spine Health, Cleveland Clinic, Cleveland, OH, United States of America
- Michael Griffin
  - Insight Enterprises Inc., Chandler, AZ, United States of America
- Richard Cartabuke
  - Department of Internal Medicine, Cleveland Clinic, Cleveland, OH, United States of America
- Luke Laffin
  - Department of Cardiovascular Medicine, Center for Blood Pressure Disorders, Cleveland Clinic, Cleveland, OH, United States of America
- Giavanna Russo-Alvarez
  - Department of Hospital Outpatient Pharmacy, Cleveland Clinic, Cleveland, OH, United States of America
- George Thomas
  - Department of Kidney Medicine, Cleveland Clinic, Cleveland, OH, United States of America
- Nicholas Smedira
  - Department of Thoracic and Cardiovascular Surgery, Cleveland Clinic, Cleveland, OH, United States of America
- Thad Meese
  - Department of Innovations Technology Development, Cleveland Clinic, Cleveland, OH, United States of America
- Michael Shost
  - Center for Spine Health, Cleveland Clinic, Cleveland, OH, United States of America
  - Case Western Reserve University School of Medicine, Cleveland, OH, United States of America
- Ghaith Habboub
  - Center for Spine Health, Cleveland Clinic, Cleveland, OH, United States of America
2
John M, Shaiba H. Identification of self-care problem in children using machine learning. Heliyon 2024; 10:e26977. [PMID: 38463780 PMCID: PMC10923687 DOI: 10.1016/j.heliyon.2024.e26977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 02/14/2024] [Accepted: 02/22/2024] [Indexed: 03/12/2024] Open
Abstract
Identification of self-care problems in children is a challenging task for medical professionals owing to its complexity and the time it consumes. Furthermore, the worldwide shortage of occupational therapists makes the task more challenging. Machine learning methods have helped reduce the complexity of problems in diverse fields. This paper employs machine learning based models to identify whether a child suffers from self-care problems using the SCADI dataset. The dataset exhibited high dimensionality and class imbalance. Initially, the dataset was reduced to a lower dimensionality. An imbalanced dataset is likely to affect the performance of machine learning models. To address this issue, the SMOTE oversampling method was used to reduce the wide variations in the class distribution. The classification methods used were Naïve Bayes, J48 and random forest. The random forest classifier trained on SMOTE-balanced data obtained the best classification performance, with a balanced accuracy of 99%. The classification model outperformed the existing expert systems.
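The core idea behind SMOTE oversampling, as used in this abstract, can be sketched as interpolation between a minority sample and one of its nearest minority neighbours. This is a simplified illustration under assumed names (`smote`, `minority`) with made-up 2-D points; it is not the authors' implementation, and library versions handle sampling ratios and edge cases that are omitted here.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical 2-D minority samples, for illustration only.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
print(len(new_points), "synthetic minority points generated")
```

Because every synthetic point lies on a segment between two real minority samples, the new data stay inside the region the minority class already occupies.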
Affiliation(s)
- Maya John
  - Artificial Intelligence and Data Analytics (AIDA) Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia
- Hadil Shaiba
  - Department of Computer Science, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
3
Reza MS, Amin R, Yasmin R, Kulsum W, Ruhi S. Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data. Heliyon 2024; 10:e24536. [PMID: 38312584 PMCID: PMC10834804 DOI: 10.1016/j.heliyon.2024.e24536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 01/06/2024] [Accepted: 01/10/2024] [Indexed: 02/06/2024] Open
Abstract
Diabetes mellitus, a chronic metabolic disorder, continues to be a major public health issue around the world. It is estimated that one in every two diabetics is undiagnosed. Early diagnosis and management of diabetes can prevent or delay the onset of complications. Our study aims to detect the disease early using a variety of machine learning and deep learning models, stacking algorithms, and other techniques. We propose two stacking-based models for diabetes disease classification using a combination of the PIMA Indian diabetes dataset, simulated data, and additional data collected from a local healthcare facility. We use both classical and deep neural network stacking ensemble methods to combine the predictions of multiple classification models and improve classification accuracy and robustness. In the evaluation protocol, we used both train-test and cross-validation (CV) techniques to validate our proposed model. The highest accuracy is obtained by a stacking ensemble with three NN architectures, resulting in an accuracy of 95.50 %, precision of 94 %, recall of 97 %, and F1-score of 96 % using 5-fold CV on the simulation study. The stacked accuracy obtained from ML algorithms for the Pima Indian Diabetes dataset is 75.03 % using the train-test split protocol, while the accuracy obtained from the CV protocol is 77.10 % on the stacked model. The range of performance scores that outperformed the CV protocol is 2.23 %-12 %. Our proposed method achieves accuracy ranging from 92 % to 95 %, and precision, recall, and F1-score ranging from 88 % to 96 %, using classical and deep neural network (NN)-based stacking methods on the primary dataset. The proposed dataset and ensemble method could be useful in the early detection and treatment of diabetes, as well as in the advancement of machine learning and data analysis techniques in the healthcare industry.
Affiliation(s)
- Md Shamim Reza
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
- Ruhul Amin
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
- Rubia Yasmin
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
- Woomme Kulsum
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
- Sabba Ruhi
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
4
Talari P, N B, Kaur G, Alshahrani H, Al Reshan MS, Sulaiman A, Shaikh A. Hybrid feature selection and classification technique for early prediction and severity of diabetes type 2. PLoS One 2024; 19:e0292100. [PMID: 38236900 PMCID: PMC10796060 DOI: 10.1371/journal.pone.0292100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 09/12/2023] [Indexed: 01/22/2024] Open
Abstract
Diabetes prediction is an ongoing research topic in which medical specialists are attempting to forecast the condition with greater precision. Diabetes typically stays dormant, and when patients are diagnosed with another condition, such as damage to the kidney vessels, problems with the retina of the eye, or a heart problem, it can cause metabolic problems and various complications in the body. Various ensemble learning techniques, including voting, boosting, and bagging, have been applied in this study. The Synthetic Minority Oversampling Technique (SMOTE), along with the K-fold cross-validation approach, was used to balance the classes and validate the findings. The Pima Indian Diabetes (PID) dataset was obtained from the UCI Machine Learning (UCI ML) repository for this study. A feature engineering technique was used to calculate the influence of lifestyle factors. A two-phase classification model has been developed to predict insulin resistance using the Sequential Minimal Optimisation (SMO) and SMOTE approaches together. The SMOTE technique is used to preprocess data in the model's first phase, while SMO classes are used in the second phase. Bagging decision trees outperformed all other classification techniques in terms of misclassification error rate, accuracy, specificity, precision, recall, F1 measure, and ROC curve. The model was created using a combined SMOTE and SMO strategy, which achieved 99.07% correction with 0.1 ms of runtime. The suggested system aims to enhance the classifier's performance in detecting the illness early.
Affiliation(s)
- Praveen Talari
  - Department of Computer Science and Engineering, Vignana Bharathi Institute of Technology, Hyderabad, India
- Bharathiraja N
  - Chitkara University Institute of Engineering and Technology, Chitkara University Punjab, Rajpura, India
- Gaganpreet Kaur
  - Chitkara University Institute of Engineering and Technology, Chitkara University Punjab, Rajpura, India
- Hani Alshahrani
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran, Saudi Arabia
- Mana Saleh Al Reshan
  - Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran, Saudi Arabia
  - Scientific and Engineering Research Centre, Najran University, Najran, Saudi Arabia
- Adel Sulaiman
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran, Saudi Arabia
- Asadullah Shaikh
  - Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran, Saudi Arabia
5
Nguyen T, Mengersen K, Sous D, Liquet B. SMOTE-CD: SMOTE for compositional data. PLoS One 2023; 18:e0287705. [PMID: 37384667 PMCID: PMC10309641 DOI: 10.1371/journal.pone.0287705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Accepted: 06/12/2023] [Indexed: 07/01/2023] Open
Abstract
Compositional data are a special kind of data, represented as a proportion carrying relative information. Although this type of data is widely spread, no solution exists to deal with the cases where the classes are not well balanced. After describing compositional data imbalance, this paper proposes an adaptation of the original Synthetic Minority Oversampling TEchnique (SMOTE) to deal with compositional data imbalance. The new approach, called SMOTE for Compositional Data (SMOTE-CD), generates synthetic examples by computing a linear combination of selected existing data points, using compositional data operations. The performance of the SMOTE-CD is tested with three different regressors (Gradient Boosting tree, Neural Networks, Dirichlet regressor) applied to two real datasets and to synthetic generated data, and the performance is evaluated using accuracy, cross-entropy, F1-score, R2 score and RMSE. The results show improvements across all metrics, but the impact of oversampling on performance varies depending on the model and the data. In some cases, oversampling may lead to a decrease in performance for the majority class. However, for the real data, the best performance across all models is achieved when oversampling is used. Notably, the F1-score is consistently increased with oversampling. Unlike the original technique, the performance is not improved when combining oversampling of the minority classes and undersampling of the majority class. The Python package smote-cd implements the method and is available online.
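A "linear combination using compositional data operations" can be sketched with the standard Aitchison operations, where perturbation and powering take the place of vector addition and scalar multiplication. This is an assumption about what the abstract means; the function names `closure` and `compositional_mix` are hypothetical, the example compositions are made up, and the smote-cd package may implement the step differently.

```python
def closure(parts):
    """Rescale strictly positive parts so they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def compositional_mix(x, y, lam):
    """Point at fraction lam along the Aitchison line from x to y.

    Equivalent to perturbing x by the powered difference of y and x,
    which reduces to closing the parts x_i**(1 - lam) * y_i**lam.
    Both inputs must have strictly positive parts.
    """
    return closure([xi ** (1.0 - lam) * yi ** lam for xi, yi in zip(x, y)])


# Hypothetical three-part compositions, for illustration only.
x = [0.2, 0.3, 0.5]
y = [0.6, 0.3, 0.1]
synthetic = compositional_mix(x, y, lam=0.5)
print(synthetic, sum(synthetic))
```

Unlike plain Euclidean interpolation followed by renormalisation, this form respects the relative (ratio) information that compositional data carry, and the result always remains a valid composition.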
Affiliation(s)
- Teo Nguyen
  - Laboratoire de Mathématiques et de leurs Applications, Université de Pau et des Pays de l’Adour, E2S UPPA, CNRS, Anglet, France
  - School of Mathematics and Physical Sciences, Macquarie University, Sydney, NSW, Australia
- Kerrie Mengersen
  - Laboratoire de Mathématiques et de leurs Applications, Université de Pau et des Pays de l’Adour, E2S UPPA, CNRS, Anglet, France
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia
- Damien Sous
  - Laboratoire des Sciences Pour l’ingénieur Appliquées à la Mécanique et au Génie Électrique, Université de Pau et des Pays de l’Adour, E2S UPPA, Anglet, France
  - Mediterranean Institute of Oceanography, Université de Toulon, Aix Marseille Université, CNRS, IRD, La Garde, France
- Benoit Liquet
  - Laboratoire de Mathématiques et de leurs Applications, Université de Pau et des Pays de l’Adour, E2S UPPA, CNRS, Anglet, France
  - School of Mathematics and Physical Sciences, Macquarie University, Sydney, NSW, Australia
6
Ebrahim OA, Derbew G. Application of supervised machine learning algorithms for classification and prediction of type-2 diabetes disease status in Afar regional state, Northeastern Ethiopia 2021. Sci Rep 2023; 13:7779. [PMID: 37179444 PMCID: PMC10182985 DOI: 10.1038/s41598-023-34906-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/12/2022] [Accepted: 05/09/2023] [Indexed: 05/15/2023] Open
Abstract
Ethiopia has been challenged by the growing magnitude of diabetes in general and type-2 diabetes in particular. Knowledge extracted from stored datasets can be an important basis for better decisions on rapid diabetes diagnosis and can support prediction for early intervention. This study addressed the problem by applying supervised machine learning algorithms for classification and prediction of type-2 diabetes disease status, and might provide context-specific information to program planners and policy makers so that priority is given to the more affected groups. The objective was to apply supervised machine learning algorithms, compare them, and select the best algorithm based on performance for classification and prediction of type-2 diabetes disease status (positive or negative) in public hospitals of Afar regional state, Northeastern Ethiopia. The study was conducted in Afar regional state from February to June of 2021. Decision tree (pruned J48), artificial neural network, k-nearest neighbor, support vector machine, binary logistic regression, random forest, and Naïve Bayes supervised machine learning algorithms were applied using secondary data from a medical database record review. A dataset of 2239 samples diagnosed for diabetes from 2012 to April 22, 2020 (1523 with type-2 diabetes and 716 without type-2 diabetes) was checked for completeness prior to analysis. WEKA 3.7 was used for the analysis with all algorithms. The algorithms were compared based on their correct classification rate, kappa statistic, confusion matrix, area under the curve, sensitivity, and specificity.
Of the seven major supervised machine learning algorithms, the best classification and prediction results were obtained from random forest [correct classification rate (93.8%), kappa statistic (0.85), sensitivity (0.98), area under the curve (0.97) and confusion matrix (446 of 454 actual positives predicted as positive)], followed by decision tree pruned J48 [correct classification rate (91.8%), kappa statistic (0.80), sensitivity (0.96), area under the curve (0.91) and confusion matrix (438 of 454 actual positives predicted as positive)] and k-nearest neighbor [correct classification rate (89.8%), kappa statistic (0.76), sensitivity (0.92), area under the curve (0.88) and confusion matrix (421 of 454 actual positives predicted as positive)]. The random forest, decision tree pruned J48 and k-nearest neighbor algorithms have better classification and prediction performance for type-2 diabetes disease status. Therefore, based on this performance, the random forest algorithm can be judged as suggestive and supportive for clinicians at the time of type-2 diabetes diagnosis.
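The comparison metrics named in this abstract can be computed directly from a confusion matrix. The sketch below assumes the reported counts mean that 446 (random forest) and 438 (J48) of 454 actual positives were predicted as positive, which is an interpretation of the abstract's wording. The full confusion matrices are not given, so the matrix passed to `cohen_kappa` is hypothetical, and both function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that were predicted positive."""
    return true_pos / (true_pos + false_neg)


def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows: actual, cols: predicted)."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Counts taken from the abstract: 454 actual positives in the test data.
print(round(sensitivity(446, 454 - 446), 2))  # random forest
print(round(sensitivity(438, 454 - 438), 2))  # decision tree pruned J48

# Hypothetical 2x2 matrix, for illustration only (not from the study).
print(cohen_kappa([[20, 5], [10, 15]]))
```

Under that reading, the two sensitivities round to the 0.98 and 0.96 reported for random forest and J48. Kappa corrects the raw agreement rate for agreement expected by chance, which is why it is lower than the correct classification rate.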
Affiliation(s)
- Oumer Abdulkadir Ebrahim
  - Department of Public Health, College of Medical and Health Science, Samara University, Samara, Ethiopia
- Getachew Derbew
  - College of Veterinary Medicine, Samara University, Samara, Ethiopia
7
Goel A, Goel AK, Kumar A. Performance analysis of multiple input single layer neural network hardware chip. MULTIMEDIA TOOLS AND APPLICATIONS 2023; 82:1-22. [PMID: 36846531 PMCID: PMC9939870 DOI: 10.1007/s11042-023-14627-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Revised: 02/24/2022] [Accepted: 02/03/2023] [Indexed: 06/18/2023]
Abstract
An artificial neural network (ANN) is a computational system that is designed to replicate and process the behavior of the human brain using neuron nodes. ANNs are made up of thousands of processing neurons with input and output modules that self-learn and compute data to offer the best results. The hardware realization of the massive neuron system is a difficult task. The research article emphasizes the design and realization of multiple input perceptron chips in Xilinx integrated system environment (ISE) 14.7 software. The proposed single-layer ANN architecture is scalable and accepts variable 64 inputs. The design is distributed in eight parallel blocks of ANN in which one block consists of eight neurons. The performance of the chip is analyzed based on the hardware utilization, memory, combinational delay, and different processing elements with targeted hardware Virtex-5 field-programmable gate array (FPGA). The chip simulation is performed in Modelsim 10.0 software. Artificial intelligence has a wide range of applications, and cutting-edge computing technology has a vast market. Hardware processors that are fast, affordable, and suited for ANN applications and accelerators are being developed by the industries. The novelty of the work is that it provides a parallel and scalable design platform on FPGA for fast switching, which is the current need in the forthcoming neuromorphic hardware.
Affiliation(s)
- Akash Goel
  - Department of Computer Science & Engineering, Galgotia’s University, Greater Noida, NCR India
- Amit Kumar Goel
  - Department of Computer Science & Engineering, Galgotia’s University, Greater Noida, NCR India
- Adesh Kumar
  - Department of Electrical & Electronics Engineering, University of Petroleum and Energy Studies, Dehradun, India
8
Ghaffar Nia N, Kaplanoglu E, Nasab A. Evaluation of artificial intelligence techniques in disease diagnosis and prediction. DISCOVER ARTIFICIAL INTELLIGENCE 2023. [PMCID: PMC9885935 DOI: 10.1007/s44163-023-00049-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 01/31/2023]
Abstract
A broad range of medical diagnoses is based on analyzing disease images obtained through high-tech digital devices. The application of artificial intelligence (AI) in the assessment of medical images has led to accurate evaluations being performed automatically, which in turn has reduced the workload of physicians, decreased errors and times in diagnosis, and improved performance in the prediction and detection of various diseases. AI techniques based on medical image processing are an essential area of research that uses advanced computer algorithms for prediction, diagnosis, and treatment planning, leading to a remarkable impact on decision-making procedures. Machine Learning (ML) and Deep Learning (DL), as advanced AI techniques, are two main subfields applied in the healthcare system to diagnose diseases, discover medication, and identify patient risk factors. The advancement of electronic medical records and big data technologies in recent years has accompanied the success of ML and DL algorithms. ML includes neural networks and fuzzy logic algorithms with various applications in automating forecasting and diagnosis processes. DL is an ML technique that, unlike classical neural network algorithms, does not rely on expert feature extraction. DL algorithms with high-performance computation give promising results in medical image analysis, such as fusion, segmentation, recording, and classification. Support Vector Machine (SVM) as an ML method and Convolutional Neural Network (CNN) as a DL method are usually the most widely used techniques for analyzing and diagnosing diseases. This review study aims to cover recent AI techniques in diagnosing and predicting numerous diseases such as cancers, heart, lung, skin, genetic, and neural disorders, which perform more precisely compared to specialists without human error. Also, AI's existing challenges and limitations in the medical area are discussed and highlighted.
Affiliation(s)
- Nafiseh Ghaffar Nia
  - College of Engineering and Computer Science, The University of Tennessee at Chattanooga, Chattanooga, TN 37403 USA
- Erkan Kaplanoglu
  - College of Engineering and Computer Science, The University of Tennessee at Chattanooga, Chattanooga, TN 37403 USA
- Ahad Nasab
  - College of Engineering and Computer Science, The University of Tennessee at Chattanooga, Chattanooga, TN 37403 USA
9
Using Recurrent Neural Networks for Predicting Type-2 Diabetes from Genomic and Tabular Data. Diagnostics (Basel) 2022; 12:diagnostics12123067. [PMID: 36553074 PMCID: PMC9776641 DOI: 10.3390/diagnostics12123067] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/04/2022] [Revised: 12/01/2022] [Accepted: 12/04/2022] [Indexed: 12/12/2022] Open
Abstract
The development of genomic technology for smart diagnosis and therapies for various diseases has lately been the most demanding area for computer-aided diagnostic and treatment research. Exponential breakthroughs in artificial intelligence and machine intelligence technologies could pave the way for identifying challenges afflicting the healthcare industry. Genomics is paving the way for predicting future illnesses, including cancer, Alzheimer's disease, and diabetes. Machine learning advancements have expedited the pace of biomedical informatics research and inspired new branches of computational biology. Furthermore, knowledge of gene relationships has led to more accurate models that can effectively detect patterns in vast volumes of data, making classification models important in various domains. Recurrent Neural Network models have a memory that allows them to retain knowledge from previous cycles and process genetic data. The present work focuses on type 2 diabetes prediction using gene sequences derived from genomic DNA fragments through automated feature selection and feature extraction procedures for matching gene patterns with training data. The suggested model was tested using tabular data to predict type 2 diabetes based on several parameters. The performance of neural networks incorporating Recurrent Neural Network (RNN) components, Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU) was tested in this research. The model's efficiency is assessed using evaluation metrics such as sensitivity, specificity, accuracy, F1-score, and Matthews Correlation Coefficient (MCC). The suggested technique predicted future illnesses with fair accuracy. Furthermore, our research showed that the suggested model could be used in real-world scenarios and that input risk variables from an end-user Android application could be kept and evaluated on a secure remote server.
10
Middha K, Mittal A. An effective feature selection method for type 2 diabetes mellitus detection using gene expression data. INTELLIGENT DECISION TECHNOLOGIES 2022. [DOI: 10.3233/idt-220077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Type 2 diabetes mellitus (T2DM) is a chronic disease caused by an insulin disorder. Decreased secretion of insulin raises the blood glucose level, and the human body cannot respond to the high glucose level. T2DM sufferers either do not produce enough insulin or resist insulin. The symptoms of T2DM are increased hunger, thirst, fatigue, frequent urination and blurred vision, and in some cases there are no symptoms. The commonly used treatments for T2DM are exercise, diet, insulin therapy and medication. In this paper, a Competitive Multi-Verse Rider Optimizer (CMVRO)-based hybrid deep learning scheme is devised for T2DM detection. The hybrid deep learning scheme involves two classifiers: the Rider based Neural Network (RideNN) and the Deep Residual Network (DRN). Moreover, a comparative analysis of T2DM detection is carried out across various feature selection approaches, such as Tanimoto similarity, Chi square (Chi-2), Fisher Score (FS), Linear Discriminant Analysis (LDA), Random Forest (RF), and Support Vector Machine recursive feature elimination (SVM-RFE). Among these, the Tanimoto similarity feature selection approach attained the best performance, with testing accuracy, sensitivity and specificity of 0.932, 0.932 and 0.914, respectively.
11
Silva GFS, Fagundes TP, Teixeira BC, Chiavegatto Filho ADP. Machine Learning for Hypertension Prediction: a Systematic Review. Curr Hypertens Rep 2022; 24:523-533. [PMID: 35731335 DOI: 10.1007/s11906-022-01212-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Accepted: 06/08/2022] [Indexed: 01/31/2023]
Abstract
PURPOSE OF REVIEW: To provide an overview of the literature regarding the use of machine learning algorithms to predict hypertension. A systematic review was performed to select recent articles on the subject. RECENT FINDINGS: The screening of the articles was conducted using a machine learning algorithm (ASReview). A total of 21 articles published between January 2018 and May 2021 were identified and compared according to variable selection, train-test split, data balancing, outcome definition, final algorithm, and performance metrics. Overall, the articles achieved an area under the ROC curve (AUROC) between 0.766 and 1.00. The algorithms most frequently identified as having the best performance were support vector machines (SVM), extreme gradient boosting (XGBoost), and random forest. Machine learning algorithms are a promising tool to improve preventive clinical decisions and targeted public health policies for hypertension. However, technical factors such as outcome definition, availability of the final code, predictive performance, explainability, and data leakage need to be consistently and critically evaluated.
Affiliation(s)
- Gabriel F S Silva
  - Department of Epidemiology, School of Public Health, University of São Paulo, São Paulo, SP, Brazil
- Thales P Fagundes
  - Laboratory of Big Data and Predictive Analysis in Healthcare, School of Public Health, University of São Paulo, São Paulo, SP, Brazil
- Bruno C Teixeira
  - Laboratory of Big Data and Predictive Analysis in Healthcare, School of Public Health, University of São Paulo, São Paulo, SP, Brazil
12
Predictive Analysis of Diabetes-Risk with Class Imbalance. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:3078025. [PMID: 36268149 PMCID: PMC9578843 DOI: 10.1155/2022/3078025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 08/31/2022] [Accepted: 09/05/2022] [Indexed: 11/29/2022]
Abstract
Diabetes type 2 (T2DM) is a common chronic disease, increasingly leading to many complications and affecting vital organs. Hyperglycemia is the main characteristic caused by insufficient insulin secretion and poses a serious risk to human health. The objective is to construct a type-2 diabetes prediction model with high classification accuracy. Advanced machine learning and predictive model techniques are utilized to achieve cutting-edge techniques for the early diagnosis of diabetes. This paper proposes an efficient performance model to predict and classify the minority class of type-2 diabetes. The impact of oversampling and undersampling approaches to reduce the effect of an unbalanced class has been compared to classification performance algorithms. Synthetic Minority Oversampling (SMOTE) and Tomek-links techniques are applied and examined. The outcomes were then compared to the original unbalanced dataset using an artificial neural network (ANN) predictive model. The model is compared with other state-of-the-art classifiers such as support vector machine (SVM), random forest (RF), and decision tree (DT). The tuned model had the best accuracy of 92.2%. The experimental findings clearly manifest the improvement in accuracy and evaluation metrics in terms of AUC and F1-measure using the SMOTE oversampling strategy rather than the baseline and undersampling schemes. The study recommends adopting dynamic hyperparameter optimization to further improve accuracy.
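The Tomek-links technique mentioned in this abstract can be sketched as follows: two samples form a Tomek link when they belong to different classes and each is the other's nearest neighbour. The function name `tomek_links` and the 1-D example data are assumptions for illustration; this is not the paper's code.

```python
import math


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links."""
    def nearest(i):
        # Index of the closest other point to points[i]
        return min((j for j in range(len(points)) if j != i),
                   key=lambda j: math.dist(points[i], points[j]))

    nn = [nearest(i) for i in range(len(points))]
    # Keep mutual nearest-neighbour pairs whose class labels differ
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]


# Hypothetical 1-D samples: label 1 is treated as the majority class.
points = [(0.0,), (1.0,), (1.2,), (3.0,), (3.1,)]
labels = [0, 0, 1, 1, 1]
links = tomek_links(points, labels)
print(links)
```

In an undersampling scheme, the majority-class member of each link would typically be removed, which cleans up the class boundary rather than adding synthetic minority samples as SMOTE does.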
|
13
|
Financial Fraud Identification Based on Stacking Ensemble Learning Algorithm: Introducing MD&A Text Information. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:1780834. [PMID: 36177320 PMCID: PMC9514921 DOI: 10.1155/2022/1780834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/04/2022] [Revised: 09/01/2022] [Accepted: 09/05/2022] [Indexed: 12/04/2022]
Abstract
In recent years, there have been frequent incidents of financial fraud committed through various means. How to more efficiently identify financial fraud and maintain capital market order is a problem that scholars from all walks of life are discussing and urgently seeking to resolve. In this study, a financial fraud identification model is constructed based on the stacking ensemble learning algorithm, and the text of the management discussion and analysis (MD&A) chapter in annual reports is introduced based on financial and nonfinancial variables, using sentiment polarity, emotional tone, and text readability as text variables. The results show that when considering financial and nonfinancial variables and introducing text variables, the recognition effect of the stacking ensemble learning model constructed in this study is significantly better than the classification results of each single classifier model. In addition, the model recognition effect is better after adding text variables. Therefore, the model is expected to provide a new and more effective method of identifying financial fraud.
|
14
|
A Framework on Performance Analysis of Mathematical Model-Based Classifiers in Detection of Epileptic Seizure from EEG Signals with Efficient Feature Selection. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:7654666. [PMID: 36110984 PMCID: PMC9470336 DOI: 10.1155/2022/7654666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/09/2022] [Revised: 07/22/2022] [Accepted: 08/10/2022] [Indexed: 11/17/2022]
Abstract
Epilepsy is one of the most commonly diagnosed neurological conditions. Electroencephalography (EEG) readings are the primary tool used to diagnose and analyze epilepsy. Epileptic EEG data display the electrical activity of the neurons and provide a significant amount of knowledge on pathology and physiology. Because this method requires a significant amount of time, several automated classification methods have been developed. In this paper, three wavelets (Haar, db4, and sym8) are employed to extract features from sets A–E of the Bonn epilepsy dataset. To select the best features of epileptic seizures, a Particle Swarm Optimization (PSO) technique is applied. The extracted features are then classified using seven classifiers: linear regression, nonlinear regression, Gaussian Mixture Modeling (GMM), K-Nearest Neighbor (KNN), Support Vector Machine with a linear kernel (SVM-linear), SVM with a polynomial kernel, and SVM with a Radial Basis Function (RBF) kernel. Classifier performance is analyzed through benchmark parameters such as sensitivity, specificity, accuracy, F1 score, error rate, and g-mean. The SVM classifier with RBF kernel on sym8 wavelet features with PSO feature selection attains the highest accuracy of 98%, with an error rate of 2%, and outperforms all other classifiers.
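The benchmark parameters listed in this abstract can all be derived from a binary confusion matrix. The sketch below uses the common textbook definitions; the function name `binary_metrics` and the counts are hypothetical, and the g-mean is assumed to be the geometric mean of sensitivity and specificity, which may differ from the paper's exact definition.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute common classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "error_rate": 1 - accuracy,
        # Assumed definition: geometric mean of sensitivity and specificity.
        "g_mean": math.sqrt(sensitivity * specificity),
    }

# Hypothetical counts for a seizure / non-seizure classifier.
m = binary_metrics(tp=45, fp=5, tn=40, fn=10)
```

Note that the error rate is simply the complement of accuracy, which is consistent with the 98% accuracy and 2% error rate reported above.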
|
15
|
Cluster-Based Improved Isolation Forest. ENTROPY 2022; 24:e24050611. [PMID: 35626495 PMCID: PMC9141139 DOI: 10.3390/e24050611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/12/2022] [Revised: 04/24/2022] [Accepted: 04/24/2022] [Indexed: 12/10/2022]
Abstract
Outlier detection is an important research direction in the field of data mining. To address the unstable detection results and low efficiency caused by the random partitioning of features in the Isolation Forest algorithm, an algorithm called CIIF (Cluster-based Improved Isolation Forest) that combines clustering and Isolation Forest is proposed. CIIF first uses the k-means method to cluster the data set, selects a specific cluster to construct a selection matrix based on the clustering results, and implements the selection mechanism of the algorithm through the selection matrix; it then builds multiple isolation trees. Finally, outlier scores are calculated from the average search length of each sample across the isolation trees, and the Top-n objects with the highest outlier scores are regarded as outliers. Comparative experiments with six algorithms on eleven real data sets show that the CIIF algorithm performs better. Compared to the Isolation Forest algorithm, the average AUC (area under the ROC curve) of the proposed CIIF algorithm is improved by 7%.
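For reference, the anomaly score of the standard Isolation Forest algorithm, on which CIIF builds, is commonly written as below. This is the score of the original algorithm and is shown only as background; the abstract does not state whether CIIF modifies it. Here $h(x)$ is the path (search) length of sample $x$ in one isolation tree, $E[h(x)]$ is its average over all trees, and $n$ is the subsample size.

```latex
s(x, n) = 2^{-\frac{E[h(x)]}{c(n)}},
\qquad
c(n) = 2H(n-1) - \frac{2(n-1)}{n},
\qquad
H(i) \approx \ln i + 0.5772156649
```

Scores close to 1 indicate short average path lengths and therefore likely outliers, while scores well below 0.5 indicate normal samples.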
|
16
|
Hazarika RA, Maji AK, Syiem R, Sur SN, Kandar D. Hippocampus Segmentation Using U-Net Convolutional Network from Brain Magnetic Resonance Imaging (MRI). J Digit Imaging 2022; 35:893-909. [PMID: 35304675 PMCID: PMC9485390 DOI: 10.1007/s10278-022-00613-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2021] [Revised: 01/04/2022] [Accepted: 01/14/2022] [Indexed: 12/21/2022] Open
Abstract
The hippocampus is a part of the limbic system in the human brain that plays an important role in forming memories and supporting intellectual abilities. In most neurological disorders related to dementia, such as Alzheimer's disease, the hippocampus is one of the earliest affected regions. Because there are no effective dementia drugs, an ambient assisted living approach may help to prevent or slow the progression of dementia. By segmenting and analyzing the size and shape of the hippocampus, it may be possible to classify the early stages of dementia. Because of its complex structure, traditional image segmentation techniques cannot segment the hippocampus accurately. Machine learning (ML) is a well-known tool in medical image processing that can predict and deliver outcomes accurately by learning from its previous results. The convolutional neural network (CNN) is one of the most popular ML algorithms. In this work, a U-Net convolutional network-based approach is used for hippocampus segmentation from 2D brain images. The original U-Net architecture segments the hippocampus with an average performance rate of 93.6%, which outperforms all other state-of-the-art methods discussed. Using a filter size of [Formula: see text], the original U-Net architecture performs a sequence of convolutional processes. We tweaked the architecture further to extract more relevant features by replacing all [Formula: see text] kernels with three alternative kernels of sizes [Formula: see text], [Formula: see text], and [Formula: see text]. The modified architecture achieved an average performance rate of 96.5%, which convincingly outperforms the original U-Net model.
Affiliation(s)
- Ruhul Amin Hazarika
- Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, 793022, India
- Arnab Kumar Maji
- Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, 793022, India
- Raplang Syiem
- Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, 793022, India
- Samarendra Nath Sur
- Department of Electronics and Communication Engineering, Sikkim Manipal Institute of Technology, Sikkim Manipal University, Majitar, Sikkim, 737136, India
- Debdatta Kandar
- Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, 793022, India
|
17
|
Tang S, Yu X, Cheang CF, Hu Z, Fang T, Choi IC, Yu HH. Diagnosis of Esophageal Lesions by Multi-Classification and Segmentation Using an Improved Multi-Task Deep Learning Model. SENSORS (BASEL, SWITZERLAND) 2022; 22:s22041492. [PMID: 35214396 PMCID: PMC8876234 DOI: 10.3390/s22041492] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/16/2021] [Revised: 01/26/2022] [Accepted: 02/08/2022] [Indexed: 05/03/2023]
Abstract
It is challenging for endoscopists to accurately detect esophageal lesions during gastrointestinal endoscopic screening because different lesions look similar in shape, size, and texture across patients. Additionally, endoscopists deal with esophageal lesions every day, hence the need for a computer-aided diagnostic tool that classifies and segments lesions in endoscopic images to reduce their burden. Therefore, we propose a multi-task classification and segmentation (MTCS) model, including the Esophageal Lesions Classification Network (ELCNet) and the Esophageal Lesions Segmentation Network (ELSNet). The ELCNet was used to classify types of esophageal lesions, and the ELSNet was used to identify lesion regions. We created a dataset by collecting 805 esophageal images from 255 patients and 198 images from 64 patients to train and evaluate the MTCS model. Compared with other methods, the proposed model achieved not only a high classification accuracy (93.43%) but also a dice similarity coefficient of 77.84% in segmentation. In conclusion, the MTCS model can boost the performance of endoscopists in detecting esophageal lesions, as it can accurately multi-classify and segment the lesions, and it is a potential assistant for endoscopists to reduce the risk of oversight.
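The dice similarity coefficient reported for segmentation measures the overlap between a predicted mask and a ground-truth mask as twice the intersection divided by the sum of the two mask sizes. A minimal sketch follows; the function name and the toy masks are hypothetical, and masks are assumed to be flat lists of 0/1 pixel labels.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two equally sized binary masks
    given as flat lists of 0/1 values."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2 * intersection / total

# Hypothetical 6-pixel masks: 2 pixels overlap, each mask has 3 positives.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 1, 1]
dice = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```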
Affiliation(s)
- Suigu Tang
- Faculty of Information Technology, Macau University of Science and Technology, Macau 999078, China; (S.T.); (X.Y.); (Z.H.); (T.F.)
- Xiaoyuan Yu
- Faculty of Information Technology, Macau University of Science and Technology, Macau 999078, China; (S.T.); (X.Y.); (Z.H.); (T.F.)
- Chak-Fong Cheang (corresponding author)
- Faculty of Information Technology, Macau University of Science and Technology, Macau 999078, China; (S.T.); (X.Y.); (Z.H.); (T.F.)
- Zeming Hu
- Faculty of Information Technology, Macau University of Science and Technology, Macau 999078, China; (S.T.); (X.Y.); (Z.H.); (T.F.)
- Tong Fang
- Faculty of Information Technology, Macau University of Science and Technology, Macau 999078, China; (S.T.); (X.Y.); (Z.H.); (T.F.)
- I-Cheong Choi
- Kiang Wu Hospital, Macau 999078, China; (I.-C.C.); (H.-H.Y.)
- Hon-Ho Yu
- Kiang Wu Hospital, Macau 999078, China; (I.-C.C.); (H.-H.Y.)
|
18
|
Stochastic Analysis of Nonlinear Cancer Disease Model through Virotherapy and Computational Methods. MATHEMATICS 2022. [DOI: 10.3390/math10030368] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Cancer is a common term for many diseases that can affect anybody. According to the World Health Organization (WHO), cancer is a leading cause of death worldwide; in 2020, ten million people died from it. This model describes the interaction of cancer cells, viral therapy, and the immune response. In this model, the cell population has four parts, namely uninfected cells (x), infected cells (y), virus-free cells (v), and immune cells (z). This study presents the analysis of the stochastic cancer virotherapy model in terms of cell population dynamics. The model results preserve the properties of the biological problem, such as dynamical consistency, positivity, and boundedness, which are essential requirements of models in these fields. Existing computational methods, such as the Euler-Maruyama, stochastic Euler, and stochastic Runge-Kutta methods, fail to preserve these properties. The proposed stochastic nonstandard finite difference method is efficient, cost-effective, and accommodates all the desired feasible properties. The existing standard stochastic methods converge conditionally or diverge in the long run, whereas the solution by the nonstandard finite difference method is stable and convergent for all time steps.
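To make the comparison concrete, the Euler-Maruyama scheme named in this abstract is the standard explicit discretization of a stochastic differential equation $dX_t = f(X_t)\,dt + g(X_t)\,dW_t$. The generic form is shown below; the drift $f$ and diffusion $g$ of the virotherapy model are not given in the abstract, so they are left as placeholders.

```latex
X_{n+1} = X_n + f(X_n)\,\Delta t + g(X_n)\,\Delta W_n,
\qquad
\Delta W_n \sim \mathcal{N}(0, \Delta t)
```

Because the Gaussian increment $\Delta W_n$ is unbounded, a single step can push a population variable below zero when $\Delta t$ is large, which is one reason such explicit schemes can violate positivity and boundedness.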
|
19
|
Guidance Image-Based Enhanced Matched Filter with Modified Thresholding for Blood Vessel Extraction. Symmetry (Basel) 2022. [DOI: 10.3390/sym14020194] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
Fundus images are an established tool for analyzing and recognizing many cardiovascular and ophthalmological diseases. Consequently, precise segmentation of blood vessels using computer vision is vital for recognizing these ailments. Although clinicians have adopted computer-aided diagnostics (CAD) in day-to-day diagnosis, it is still quite difficult to conduct fully automated analysis based exclusively on information contained in fundus images. In fundus image applications, one method for automatic analysis is to ascertain symmetry/asymmetry details from corresponding areas of the retina and investigate their association with positive clinical findings. In the field of diabetic retinopathy, matched filters are an established technique for vessel extraction. However, the efficiency of matched filters is reduced on noisy images. In this work, a joint model of a fast guided filter and a matched filter is suggested for enhancing abnormal retinal images containing low vessel contrasts. Extracting all information from an image correctly is one of the important factors in image enhancement. A guided filter has excellent edge-preserving properties but still tends to suffer from halo artifacts near edges. Fast guided filtering is a technique that subsamples the filtering input image and the guidance image and calculates the local linear coefficients for upsampling. In short, the proposed technique applies a fast guided filter and a matched filter to attain improved performance measures for vessel extraction. The proposed technique was assessed on the DRIVE and CHASE_DB1 datasets and achieved accuracies of 0.9613 and 0.960, respectively, both of which are higher than the accuracy of the original matched filter and other suggested vessel segmentation algorithms.
|
20
|
Kumar Y, Koul A, Singla R, Ijaz MF. Artificial intelligence in disease diagnosis: a systematic literature review, synthesizing framework and future research agenda. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING 2022; 14:8459-8486. [PMID: 35039756 PMCID: PMC8754556 DOI: 10.1007/s12652-021-03612-z] [Citation(s) in RCA: 117] [Impact Index Per Article: 58.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/04/2020] [Accepted: 11/18/2021] [Indexed: 05/03/2023]
Abstract
Artificial intelligence can assist providers in a variety of patient care and intelligent health systems. Artificial intelligence techniques ranging from machine learning to deep learning are prevalent in healthcare for disease diagnosis, drug discovery, and patient risk identification. Numerous medical data sources, such as ultrasound, magnetic resonance imaging, mammography, genomics, and computed tomography scans, are required to diagnose diseases accurately using artificial intelligence techniques. Furthermore, artificial intelligence has improved the hospital experience and sped up preparing patients to continue their rehabilitation at home. This article presents a comprehensive survey of artificial intelligence techniques for diagnosing numerous diseases, such as Alzheimer's disease, cancer, diabetes, chronic heart disease, tuberculosis, stroke and cerebrovascular disease, hypertension, skin disease, and liver disease. We conducted an extensive survey covering the medical imaging datasets used and their feature extraction and classification processes for prediction. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were used to select articles published up to October 2020 on the Web of Science, Scopus, Google Scholar, PubMed, Excerpta Medical Database, and Psychology Information for early prediction of distinct kinds of diseases using artificial intelligence-based techniques. Based on the study of different articles on disease diagnosis, the results are also compared using various quality parameters such as prediction rate, accuracy, sensitivity, specificity, area under the curve, precision, recall, and F1-score.
Affiliation(s)
- Yogesh Kumar
- Department of Computer Engineering, Indus Institute of Technology and Engineering, Indus University, Ahmedabad, 382115 India
- Ruchi Singla
- Department of Research, Innovations, Sponsored Projects and Entrepreneurship, CGC Landran, Mohali, India
- Muhammad Fazal Ijaz
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, 05006 South Korea
|
21
|
Intelligent Anomaly Identification of Uplift Pressure Monitoring Data and Structural Diagnosis of Concrete Dam. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12020612] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Uplift pressure is an essential load on a concrete dam, and abnormal changes in it directly threaten the dam's safety and stability. It is therefore of great significance to accurately and efficiently mine the hidden information in uplift pressure monitoring data to clarify the safety state of the dam. In this paper, the density-based spatial clustering of applications with noise (DBSCAN) method is used to intelligently identify abnormal occurrence points and abnormal stable stages in the monitoring data. Then, a method for applying the measured uplift pressure is put forward to accurately reflect the spatial distribution and abnormal position of uplift pressure in the dam foundation. It is easy to calculate and is connected with the finite element method through self-written software. Finally, the measured uplift pressure is applied to the finite element model of the concrete dam. By comparing the structural behavior of the dam under the design and measured uplift pressure, the influence of abnormal uplift pressure on the safety state of the dam is clarified, which can guide project operation. The above analysis ideas and calculation methods were verified on a 98.5 m concrete arch dam in western China. The anomaly identification method and the uplift pressure application method can provide ideas and tools for the structural diagnosis of concrete dams.
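How DBSCAN separates abnormal readings from normal ones can be shown with a minimal one-dimensional sketch. This is an illustration under stated assumptions, not the paper's procedure: the function name `dbscan_1d`, the readings, and the values of `eps` and `min_pts` are hypothetical, and the features the authors actually cluster on are not given in the abstract.

```python
def dbscan_1d(values, eps, min_pts):
    """Label each value with a cluster id (0, 1, ...) or -1 for noise,
    using the standard DBSCAN expansion on one-dimensional data."""
    n = len(values)
    labels = [None] * n  # None means "not visited yet"

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels

# Hypothetical uplift pressure readings with one abnormal spike.
readings = [10.1, 10.2, 10.0, 10.3, 14.9, 10.2, 10.1]
labels = dbscan_1d(readings, eps=0.5, min_pts=3)
anomalies = [i for i, lab in enumerate(labels) if lab == -1]
```

Points labelled -1 have too few neighbours within `eps` and are not reachable from any dense region, so they are natural candidates for abnormal occurrence points.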
|
22
|
Zhang F, Li Q. Constructing ontologies by mining deep semantics from XML Schemas and XML instance documents. INT J INTELL SYST 2022. [DOI: 10.1002/int.22643] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
Affiliation(s)
- Fu Zhang
- School of Computer Science & Engineering, Northeastern University, Shenyang, China
- Qiang Li
- School of Computer Science & Engineering, Northeastern University, Shenyang, China
|
23
|
A Fuzzy Rule-Based System for Classification of Diabetes. SENSORS 2021; 21:s21238095. [PMID: 34884099 PMCID: PMC8659829 DOI: 10.3390/s21238095] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/11/2021] [Revised: 11/27/2021] [Accepted: 11/28/2021] [Indexed: 12/26/2022]
Abstract
Diabetes is a potentially fatal disease that currently has no cure. However, early diagnosis helps patients start timely treatment and thus reduces or eliminates the risk of severe complications. The prevalence of diabetes has been rising rapidly worldwide. Several methods have been introduced to diagnose diabetes at an early stage; however, most of these methods lack interpretability, so the diagnostic process cannot be explained. In this paper, fuzzy logic is employed to develop an interpretable model for the early diagnosis of diabetes. Fuzzy logic is combined with the cosine amplitude method, and two fuzzy classifiers are constructed. Fuzzy rules are then designed based on these classifiers. Lastly, a publicly available diabetes dataset is used to evaluate the performance of the proposed fuzzy rule-based model. The results show that the proposed model outperforms existing techniques, achieving an accuracy of 96.47%. This prediction accuracy suggests that the model can be used in the healthcare sector for the accurate diagnosis of diabetes.
|
24
|
Raza A, Awrejcewicz J, Rafiq M, Mohsin M. Breakdown of a Nonlinear Stochastic Nipah Virus Epidemic Models through Efficient Numerical Methods. ENTROPY (BASEL, SWITZERLAND) 2021; 23:1588. [PMID: 34945894 PMCID: PMC8700744 DOI: 10.3390/e23121588] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/02/2021] [Revised: 11/17/2021] [Accepted: 11/23/2021] [Indexed: 12/25/2022]
Abstract
Background: Nipah virus (NiV) is a zoonotic virus (transmitted from animals to humans), which can also be transmitted through contaminated food or directly between people. According to a World Health Organization (WHO) report, the transmission of Nipah virus infection varies from animals to humans or humans to humans. The case fatality rate is estimated at 40% to 75%. The most infected regions include Cambodia, Ghana, Indonesia, Madagascar, the Philippines, and Thailand. The Nipah virus model is categorized into four parts: susceptible (S), exposed (E), infected (I), and recovered (R). Methods: The structural properties such as dynamical consistency, positivity, and boundedness are the considerable requirements of models in these fields. However, existing numerical methods like Euler-Maruyama and Stochastic Runge-Kutta fail to explain the main features of the biological problems. Results: The proposed stochastic non-standard finite difference (NSFD) employs standard and non-standard approaches in the numerical solution of the model, with positivity and boundedness as the characteristic determinants for efficiency and low-cost approximations. While the results from the existing standard stochastic methods converge conditionally or diverge in the long run, the solution by the stochastic NSFD method is stable and convergent over all time steps. Conclusions: The stochastic NSFD is an efficient, cost-effective method that accommodates all the desired feasible properties.
Affiliation(s)
- Ali Raza
- Department of Mathematics, Govt. Maulana Zafar Ali Khan Graduate College Wazirabad, Punjab Higher Education Department (PHED), Lahore 54000, Pakistan
- Jan Awrejcewicz
- Department of Automation, Biomechanics and Mechatronics, Lodz University of Technology, 1/15 Stefanowskiego St., 90-924 Lodz, Poland
- Muhammad Rafiq
- Department of Mathematics, Faculty of Sciences, University of Central Punjab, Lahore 54600, Pakistan
- Muhammad Mohsin
- Department of Mathematics, Technische Universitat Chemnitz, 62, 09111 Chemnitz, Germany
|
25
|
Zhang Z, Xiao T, Qin X. Fly visual evolutionary neural network solving large‐scale global optimization. INT J INTELL SYST 2021. [DOI: 10.1002/int.22564] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Affiliation(s)
- Zhuhong Zhang
- Department of Big Data Science and Engineering, College of Big Data and Information Engineering, Guizhou University, Guiyang, Guizhou, China
- Tianyu Xiao
- Guizhou Provincial Characteristic Key Laboratory of System Optimization and Scientific Computation, Guizhou University, Guiyang, Guizhou, China
- Xiuchang Qin
- Guizhou Provincial Characteristic Key Laboratory of System Optimization and Scientific Computation, Guizhou University, Guiyang, Guizhou, China
|
26
|
Zoumpekas T, Puig A, Salamó M, Garcı́a‐Sellés D, Blanco Nuñez L, Guinau M. An intelligent framework for end‐to‐end rockfall detection. INT J INTELL SYST 2021. [DOI: 10.1002/int.22557] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Affiliation(s)
- Thanasis Zoumpekas
- Department of Mathematics and Computer Science, WAI Research Group, IMUB and UBICS Institutes, University of Barcelona, Barcelona, Spain
- Anna Puig
- Department of Mathematics and Computer Science, WAI Research Group, IMUB and UBICS Institutes, University of Barcelona, Barcelona, Spain
- Maria Salamó
- Department of Mathematics and Computer Science, WAI Research Group, IMUB and UBICS Institutes, University of Barcelona, Barcelona, Spain
- David García‐Sellés
- Department of Earth and Ocean Dynamics, RISKNAT Research Group, Geomodels Institute, University of Barcelona, Barcelona, Spain
- Laura Blanco Nuñez
- Department of Earth and Ocean Dynamics, GGAC Research Group, Geomodels Institute, University of Barcelona, Barcelona, Spain
- Anufra—Soil and Water Consulting Barcelona Spain
- Marta Guinau
- Department of Earth and Ocean Dynamics, RISKNAT Research Group, Geomodels Institute, University of Barcelona, Barcelona, Spain
|
27
|
A Hybrid Method to Enhance Thick and Thin Vessels for Blood Vessel Segmentation. Diagnostics (Basel) 2021; 11:diagnostics11112017. [PMID: 34829365 PMCID: PMC8621384 DOI: 10.3390/diagnostics11112017] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2021] [Revised: 10/25/2021] [Accepted: 10/25/2021] [Indexed: 11/16/2022] Open
Abstract
Retinal blood vessels have been shown to provide evidence of ophthalmic disease through changes in tortuosity, branching angles, or diameter. Although many enhancement filters are widely used, the Jerman filter responds quite effectively at vessels, edges, and bifurcations and improves the visualization of structures. The curvelet transform, in contrast, is specifically designed to associate scale with orientation and can be used to recover signals from noisy data by curvelet shrinkage. This paper describes a method to further improve the performance of the curvelet transform. A distinctive fusion of the curvelet transform and the Jerman filter is presented for retinal blood vessel segmentation. Mean-C thresholding is employed for segmentation. The suggested method achieves average accuracies of 0.9600 and 0.9559 on DRIVE and CHASE_DB1, respectively. Simulation results show better performance and faster implementation of the suggested scheme in comparison with similar approaches in the literature.
|
28
|
Automated Segmentation of Median Nerve in Dynamic Sonography Using Deep Learning: Evaluation of Model Performance. Diagnostics (Basel) 2021; 11:diagnostics11101893. [PMID: 34679591 PMCID: PMC8534332 DOI: 10.3390/diagnostics11101893] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2021] [Revised: 10/01/2021] [Accepted: 10/10/2021] [Indexed: 11/21/2022] Open
Abstract
There is an emerging trend to employ dynamic sonography in the diagnosis of entrapment neuropathy, in which the entrapped nerve exhibits aberrant spatiotemporal characteristics when adjacent tissues move. However, manually tracking the entrapped nerve in consecutive images demands extensive human labor, which impedes clinical adoption. Here we evaluated the performance of automated median nerve segmentation in dynamic sonography using a variety of deep learning models pretrained with ImageNet, including DeepLabV3+, U-Net, FPN, and Mask R-CNN. Dynamic ultrasound images of the median nerve at the wrist level were acquired from 52 subjects diagnosed with carpal tunnel syndrome while they moved their fingers. The videos of 16 subjects exhibiting diverse appearance were used for model testing, and those of the remaining 36 subjects were used for training. The centroid, circularity, perimeter, and cross-sectional area of the median nerve in each frame were automatically determined from the inferred nerve. Model performance was evaluated by the intersection over union (IoU) score between the annotated and model-predicted data. We found that DeepLabV3+ and Mask R-CNN predicted the median nerve best, with average IoU scores close to 0.83, which indicates the feasibility of automated median nerve segmentation in dynamic sonography using deep learning.
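The intersection over union (IoU) score used for evaluation is the number of pixels shared by the annotated and predicted masks divided by the number of pixels in either mask. A minimal sketch follows; the function name and the toy 3x3 masks are hypothetical, and masks are assumed to be nested lists of 0/1 values.

```python
def iou_score(pred, target):
    """Intersection over union for two equally sized 2D binary masks
    given as nested lists of 0/1 values."""
    pairs = [
        (p, t)
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if union == 0 else intersection / union

# Hypothetical masks: the annotation has 4 nerve pixels, the prediction
# has 4, and 3 of them coincide, so the union covers 5 pixels.
annotated = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
predicted = [[0, 1, 1],
             [0, 1, 0],
             [0, 1, 0]]
score = iou_score(predicted, annotated)  # 3 / 5
```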
|
29
|
Mallika C, Selvamuthukumaran S. A Hybrid Crow Search and Grey Wolf Optimization Technique for Enhanced Medical Data Classification in Diabetes Diagnosis System. INT J COMPUT INT SYS 2021. [DOI: 10.1007/s44196-021-00013-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Open
Abstract
Diabetes is an extremely serious hazard to global health, and its incidence is increasing rapidly. In this paper, we develop an effective system to diagnose diabetes using a hybrid optimization-based Support Vector Machine (SVM). The proposed hybrid optimization technique integrates a Crow Search Algorithm (CSA) and a Binary Grey Wolf Optimizer (BGWO) to exploit the full potential of SVM in the diabetes diagnosis system. The effectiveness of the proposed hybrid optimization-based SVM (hereafter called CS-BGWO-SVM) is carefully studied on real-world databases, namely the UCI Pima Indian standard dataset and the diabetes type dataset from the Data World repository. To evaluate the CS-BGWO-SVM technique, its performance is compared with several state-of-the-art SVM-based approaches with respect to predictive accuracy, Intersection over Union (IoU), specificity, sensitivity, and the area under the receiver operating characteristic curve (AUC). The empirical analysis shows that CS-BGWO-SVM is a more efficient approach with outstanding classification accuracy. Furthermore, we perform the Wilcoxon statistical test to decide whether the proposed CS-BGWO-SVM approach offers a substantial improvement in the performance measures. We conclude that CS-BGWO-SVM is a better diabetes diagnostic model than the modern diagnosis methods previously reported in the literature.
|
30
|
Xu Z, Shen D, Nie T, Kou Y, Yin N, Han X. A cluster-based oversampling algorithm combining SMOTE and k-means for imbalanced medical data. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.02.056] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
31
|
Zhao H, Liu Z, Li M, Liang L. Healthcare Warranty Policies Optimization for Chronic Diseases Based on Delay Time Concept. Healthcare (Basel) 2021; 9:healthcare9081088. [PMID: 34442225 PMCID: PMC8392548 DOI: 10.3390/healthcare9081088] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2021] [Revised: 08/19/2021] [Accepted: 08/20/2021] [Indexed: 11/16/2022] Open
Abstract
Warranties for healthcare can be greatly beneficial for cost reductions and improvements in patient satisfaction. Under healthcare warranties, healthcare providers receive a lump sum payment for the entire care episode, which covers a bundle of healthcare services, including treatment decisions during initial hospitalization and subsequent readmissions, as well as disease-monitoring plans composed of periodic follow-ups. Higher treatment intensities and more radical monitoring strategies result in higher medical costs, but high treatment intensities reduce the baseline readmission rates. This study intends to provide a systematic optimization framework for healthcare warranty policies. In this paper, the proposed model allows healthcare providers to determine the optimal combination of treatment decisions and disease-monitoring policies to minimize the total expected healthcare warranty cost over the prespecified period. Given the nature of the disease progression, we introduced a delay time model to simulate the progression of chronic diseases. Based on this, we formulated an accumulated age model to measure the effect of follow-up on the patient's readmission risk. By means of the proposed model, the optimal treatment intensity and the monitoring policy can be derived. A case study of pediatric type 1 diabetes mellitus is presented to illustrate the applicability of the proposed model. The findings could form the basis of developing effective healthcare warranty policies for patients with chronic diseases.
Affiliation(s)
- Heng Zhao
- College of Management and Economics, Tianjin University, Tianjin 300072, China; (H.Z.); (Z.L.); (M.L.)
| | - Zixian Liu
- College of Management and Economics, Tianjin University, Tianjin 300072, China; (H.Z.); (Z.L.); (M.L.)
| | - Mei Li
- College of Management and Economics, Tianjin University, Tianjin 300072, China; (H.Z.); (Z.L.); (M.L.)
| | - Lijun Liang
- School of Management, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China
| |
|
32
|
Gupta D, Choudhury A, Gupta U, Singh P, Prasad M. Computational approach to clinical diagnosis of diabetes disease: a comparative study. MULTIMEDIA TOOLS AND APPLICATIONS 2021; 80:30091-30116. [DOI: 10.1007/s11042-020-10242-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/11/2019] [Revised: 10/14/2020] [Accepted: 12/09/2020] [Indexed: 08/30/2023]
|
33
|
Qualitative Data Clustering to Detect Outliers. ENTROPY 2021; 23:e23070869. [PMID: 34356410 PMCID: PMC8307081 DOI: 10.3390/e23070869] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/17/2021] [Revised: 06/26/2021] [Accepted: 07/01/2021] [Indexed: 11/17/2022]
Abstract
Detecting outliers is a widely studied problem in many disciplines, including statistics, data mining, and machine learning. All anomaly detection activities are aimed at identifying cases of unusual behavior compared to most observations. There are many methods to deal with this issue, which are applicable depending on the size of the data set, the way it is stored, and the type of attributes and their values. Most of them focus on traditional datasets with a large number of quantitative attributes. While solutions for detecting outliers in quantitative sets are numerous, detecting outliers in data containing only qualitative variables remains a problem with a small number of research solutions. This article was designed to compare three different categorical data clustering algorithms: the K-modes algorithm, derived from MacQueen’s K-means algorithm, and the STIRR and ROCK algorithms. The comparison concerned the method of dividing the set into clusters and, in particular, the outliers detected by the algorithms. During the research, the authors analyzed the clusters detected by the indicated algorithms, using several datasets that differ in terms of the number of objects and variables. They also conducted experiments on the parameters of the algorithms. The presented study made it possible to check whether the algorithms detect outliers in the data similarly and how much they depend on individual parameters and on properties of the set, such as the number of variables, tuples, and categories of a qualitative variable.
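As a rough illustration of how a K-modes style method handles qualitative data, the sketch below computes the simple matching dissimilarity between categorical records and the attribute-wise mode of a cluster. It is not the code used in the article; the function names, the toy records, and the idea of reading a large distance to the mode as a hint of outlyingness are all assumptions made for illustration.

```python
from collections import Counter

def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def cluster_mode(records):
    """Return the attribute-wise most frequent category of a cluster."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))

# Hypothetical toy cluster of purely qualitative records.
cluster = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
]
mode = cluster_mode(cluster)

# A record that disagrees with the mode on every attribute is a candidate outlier.
candidate = ("blue", "large", "square")
distance = matching_dissimilarity(candidate, mode)
print(mode, distance)
```

A full K-modes run would alternate between assigning records to the nearest mode and recomputing the modes; only the two building blocks are shown here.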
|
34
|
Multiclassification of Endoscopic Colonoscopy Images Based on Deep Transfer Learning. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021; 2021:2485934. [PMID: 34306173 PMCID: PMC8272675 DOI: 10.1155/2021/2485934] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/06/2021] [Revised: 05/27/2021] [Accepted: 06/09/2021] [Indexed: 11/17/2022]
Abstract
With the continuous improvement of human living standards, dietary habits are constantly changing, which brings various bowel problems. Among them, the morbidity and mortality rates of colorectal cancer have maintained a significant upward trend. In recent years, the application of deep learning in the medical field has become increasingly widespread and deep. In colonoscopy, artificial intelligence based on deep learning is mainly used to assist in the detection of colorectal polyps and the classification of colorectal lesions. However, classification can lead to confusion between polyps and other diseases. In order to accurately diagnose various diseases in the intestines and improve the classification accuracy of polyps, this work proposes a multiclassification method for medical colonoscopy images based on deep learning, which mainly classifies four conditions: polyps, inflammation, tumor, and normal. In view of the relatively small amount of data, a network first trained on ImageNet was used as the pretrained model, and the prior knowledge learned from the source-domain learning task was applied to the classification task for intestinal illnesses. Then, we fine-tune the model on our data sets to make it more suitable for the task of intestinal classification. Finally, the model is applied to the multiclassification of medical colonoscopy images. Experimental results show that the method in this work can significantly improve the recognition rate of polyps while ensuring the classification accuracy of other categories, so as to assist the doctor in the diagnosis of surgical resection.
|
35
|
Hussain L, Huang P, Nguyen T, Lone KJ, Ali A, Khan MS, Li H, Suh DY, Duong TQ. Machine learning classification of texture features of MRI breast tumor and peri-tumor of combined pre- and early treatment predicts pathologic complete response. Biomed Eng Online 2021; 20:63. [PMID: 34183038 PMCID: PMC8240261 DOI: 10.1186/s12938-021-00899-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2021] [Accepted: 06/09/2021] [Indexed: 12/02/2022] Open
Abstract
Purpose: This study used machine learning classification of texture features from MRI of breast tumor and peri-tumor at multiple treatment time points in conjunction with molecular subtypes to predict eventual pathological complete response (PCR) to neoadjuvant chemotherapy. Materials and method: This study employed a subset of patients (N = 166) with PCR data from the I-SPY-1 TRIAL (2002–2006). This cohort consisted of patients with stage 2 or 3 breast cancer that underwent anthracycline–cyclophosphamide and taxane treatment. Magnetic resonance imaging (MRI) was acquired pre-neoadjuvant chemotherapy, early, and mid-treatment. Texture features were extracted from post-contrast-enhanced MRI, pre- and post-contrast subtraction images, and with morphological dilation to include peri-tumoral tissue. Molecular subtypes and Ki67 were also included in the prediction model. Performance of classification models used the receiver operating characteristics curve analysis including area under the curve (AUC). Statistical analysis was done using unpaired two-tailed t-tests. Results: Molecular subtypes alone yielded moderate prediction performance of PCR (AUC = 0.82, p = 0.07). Pre-, early, and mid-treatment data alone yielded moderate performance (AUC = 0.88, 0.72, and 0.78, p = 0.03, 0.13, 0.44, respectively). The combined pre- and early treatment data markedly improved performance (AUC = 0.96, p = 0.0003). Addition of molecular subtypes improved performance slightly for individual time points but substantially for the combined pre- and early treatment (AUC = 0.98, p = 0.0003). The optimal morphological dilation was 3–5 pixels. Subtraction of post- and pre-contrast MRI further improved performance (AUC = 0.98, p = 0.00003). Finally, among the machine-learning algorithms evaluated, the RUSBoosted Tree machine-learning method yielded the highest performance.
Conclusion: AI-classification of texture features from MRI of breast tumor at multiple treatment time points accurately predicts eventual PCR. Longitudinal changes in texture features and peri-tumoral features further improve PCR prediction performance. Accurate assessment of treatment efficacy early on could minimize unnecessary toxic chemotherapy and enable mid-treatment modification for patients to achieve better clinical outcomes.
Affiliation(s)
- Lal Hussain
- Department of Computer Science & IT, Neelum Campus, The University of Azad Jammu and Kashmir, Muzaffarabad, Azad Kashmir, Pakistan; Department of Computer Science & IT, King Abdullah Campus, The University of Azad Jammu and Kashmir, Muzaffarabad, Azad Kashmir, Pakistan; Department of Radiology, Renaissance School of Medicine at Stony Brook University, 101 Nicolls Rd, Stony Brook, NY, 11794, USA; Department of Radiology, Albert Einstein College of Medicine and Montefiore Medical Center, 111 East 210th Street, Bronx, NY, 10467, USA
| | - Pauline Huang
- Department of Radiology, Renaissance School of Medicine at Stony Brook University, 101 Nicolls Rd, Stony Brook, NY, 11794, USA
| | - Tony Nguyen
- Department of Radiology, Renaissance School of Medicine at Stony Brook University, 101 Nicolls Rd, Stony Brook, NY, 11794, USA
| | - Kashif J Lone
- Department of Computer Science & IT, King Abdullah Campus, The University of Azad Jammu and Kashmir, Muzaffarabad, Azad Kashmir, Pakistan
| | - Amjad Ali
- Department of Computer Science, COMSATS University Islamabad, Lahore Campus, Lahore, Pakistan
| | - Muhammad Salman Khan
- Department of Computer Science, COMSATS University Islamabad, Lahore Campus, Lahore, Pakistan
| | - Haifang Li
- Department of Radiology, Renaissance School of Medicine at Stony Brook University, 101 Nicolls Rd, Stony Brook, NY, 11794, USA
| | - Doug Young Suh
- College of Electronics and Convergence Engineering, Kyung Hee University, Seoul, South Korea.
| | - Tim Q Duong
- Department of Radiology, Albert Einstein College of Medicine and Montefiore Medical Center, 111 East 210th Street, Bronx, NY, 10467, USA
| |
|
36
|
|
37
|
Messina D, Borrelli P, Russo P, Salvatore M, Aiello M. Voxel-Wise Feature Selection Method for CNN Binary Classification of Neuroimaging Data. Front Neurosci 2021; 15:630747. [PMID: 33958980 PMCID: PMC8093438 DOI: 10.3389/fnins.2021.630747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2020] [Accepted: 02/26/2021] [Indexed: 11/23/2022] Open
Abstract
Voxel-wise group analysis is presented as a novel feature selection (FS) technique for a deep learning (DL) approach to brain imaging data classification. The method, based on a voxel-wise two-sample t-test and denoted as t-masking, is integrated into the learning procedure as a data-driven FS strategy. t-Masking has been introduced in a convolutional neural network (CNN) for the test bench of binary classification of very-mild Alzheimer’s disease vs. normal control, using a structural magnetic resonance imaging dataset of 180 subjects. To better characterize the t-masking impact on CNN classification performance, six different experimental configurations were designed. Moreover, the performances of the presented FS method were compared to those of similar machine learning (ML) models that relied on different FS approaches. Overall, our results show an enhancement of about 6% in performance when t-masking was applied. Moreover, the reported performance enhancement was higher with respect to similar FS-based ML models. In addition, evaluation of the impact of t-masking on various selection rates has been provided, serving as a useful characterization for future insights. The proposed approach is also highly generalizable to other DL architectures, neuroimaging modalities, and brain pathologies.
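The t-masking idea in this abstract can be sketched as a per-voxel two-sample t-test followed by thresholding. The example below is a minimal sketch, not the authors' pipeline: the Welch form of the statistic, the fixed threshold on |t|, the function names, and the three-voxel toy data are all assumptions, and the abstract does not say how the selection rate is chosen.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Two-sample t statistic with unequal variances (Welch form, assumed here)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

def t_mask(group_a, group_b, threshold):
    """Keep voxels whose absolute t statistic reaches the threshold.

    group_a and group_b are lists of subjects; each subject is a flat list
    of voxel intensities of equal length.
    """
    mask = []
    for voxel in range(len(group_a[0])):
        a = [subject[voxel] for subject in group_a]
        b = [subject[voxel] for subject in group_b]
        mask.append(abs(welch_t(a, b)) >= threshold)
    return mask

# Hypothetical data: 3 subjects per group, 3 voxels each; only voxel 1 differs.
patients = [[1.0, 5.0, 2.0], [1.2, 5.1, 2.1], [0.9, 4.9, 1.9]]
controls = [[1.1, 3.0, 2.0], [1.0, 3.1, 2.2], [0.95, 2.9, 1.8]]
mask = t_mask(patients, controls, threshold=3.0)
print(mask)
```

In a learning pipeline the mask would be estimated on training subjects only and then applied to the inputs of the CNN; a voxel with zero variance in both groups would need special handling, which this sketch omits.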
Affiliation(s)
| | | | - Paolo Russo
- Dipartimento di Fisica "Ettore Pancini", Università Degli Studi di Napoli "Federico II" - Complesso Universitario di Monte Sant'Angelo, Naples, Italy
| | | | | |
|
38
|
Srinivasu PN, SivaSai JG, Ijaz MF, Bhoi AK, Kim W, Kang JJ. Classification of Skin Disease Using Deep Learning Neural Networks with MobileNet V2 and LSTM. SENSORS (BASEL, SWITZERLAND) 2021; 21:2852. [PMID: 33919583 PMCID: PMC8074091 DOI: 10.3390/s21082852] [Citation(s) in RCA: 143] [Impact Index Per Article: 47.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/13/2021] [Revised: 04/08/2021] [Accepted: 04/16/2021] [Indexed: 12/18/2022]
Abstract
Deep learning models are efficient in learning the features that assist in understanding complex patterns precisely. This study proposed a computerized process of classifying skin disease through deep learning-based MobileNet V2 and Long Short Term Memory (LSTM). The MobileNet V2 model proved to be efficient, with better accuracy, and can work on lightweight computational devices. The proposed model is efficient in maintaining stateful information for precise predictions. A grey-level co-occurrence matrix is used for assessing the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by Visual Geometry Group (VGG), and a convolutional neural network architecture expanded with few changes. The HAM10000 dataset is used and the proposed method has outperformed other methods with more than 85% accuracy. It recognizes the affected region much faster, with almost 2× fewer computations than the conventional MobileNet model, which results in minimal computational effort. Furthermore, a mobile application is designed for instant and proper action. It helps the patient and dermatologists identify the type of disease from the affected region's image at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners efficiently and effectively diagnose skin conditions, thereby reducing further complications and morbidity.
Affiliation(s)
- Parvathaneni Naga Srinivasu
- Department of Computer Science and Engineering, Gitam Institute of Technology, GITAM Deemed to be University, Rushikonda, Visakhapatnam 530045, India;
| | | | - Muhammad Fazal Ijaz
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Korea;
| | - Akash Kumar Bhoi
- Department of Electrical and Electronics Engineering, Sikkim Manipal Institute of Technology, Sikkim Manipal University, Majitar 737136, India;
| | - Wonjoon Kim
- Division of Future Convergence (HCI Science Major), Dongduk Women’s University, Seoul 02748, Korea
| | - James Jin Kang
- School of Science, Edith Cowan University, Joondalup 6027, Australia
| |
|
39
|
Vuttipittayamongkol P, Elyan E, Petrovski A. On the class overlap problem in imbalanced data classification. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106631] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
|
40
|
eGAP: An Evolutionary Game Theoretic Approach to Random Forest Pruning. BIG DATA AND COGNITIVE COMPUTING 2020. [DOI: 10.3390/bdcc4040037] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
To make healthcare available and easily accessible, the Internet of Things (IoT), which paved the way to the construction of smart cities, marked the birth of many smart applications in numerous areas, including healthcare. As a result, smart healthcare applications have been and are being developed to provide, using mobile and electronic technology, higher diagnosis quality of the diseases, better treatment of the patients, and improved quality of lives. Since smart healthcare applications that are mainly concerned with the prediction of healthcare data (like diseases for example) rely on predictive healthcare data analytics, it is imperative for such predictive healthcare data analytics to be as accurate as possible. In this paper, we will exploit supervised machine learning methods in classification and regression to improve the performance of the traditional Random Forest on healthcare datasets, both in terms of accuracy and classification/regression speed, in order to produce an effective and efficient smart healthcare application, which we have termed eGAP. eGAP uses the evolutionary game theoretic approach replicator dynamics to evolve a Random Forest ensemble. Trees of high resemblance in an initial Random Forest are clustered, and then clusters grow and shrink by adding and removing trees using replicator dynamics, according to the predictive accuracy of each subforest represented by a cluster of trees. All clusters have an initial number of trees that is equal to the number of trees in the smallest cluster. Cluster growth is performed using trees that are not initially sampled. The speed and accuracy of the proposed method have been demonstrated by an experimental study on 10 classification and 10 regression medical datasets.
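The replicator dynamics mentioned in this abstract can be illustrated with a discrete update in which each cluster's share of the forest grows in proportion to its fitness relative to the population average. This is only a sketch of the general game-theoretic rule, not the eGAP implementation: using subforest accuracy directly as fitness, the three equal starting shares, and the accuracy values are assumptions, and the abstract does not state how shares are converted into whole trees.

```python
def replicator_step(shares, fitness):
    """One discrete replicator update: share_i <- share_i * f_i / average fitness."""
    average = sum(s * f for s, f in zip(shares, fitness))
    return [s * f / average for s, f in zip(shares, fitness)]

# Hypothetical clusters of similar trees with equal initial shares.
shares = [1 / 3, 1 / 3, 1 / 3]
# Hypothetical predictive accuracy of the subforest each cluster represents.
fitness = [0.9, 0.8, 0.7]

for _ in range(5):
    shares = replicator_step(shares, fitness)

print([round(s, 3) for s in shares])
```

Clusters with above-average accuracy gain share and the others shrink, which mirrors the grow-and-shrink behavior the abstract describes; the shares always sum to one.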
|
41
|
AlJame M, Ahmad I, Imtiaz A, Mohammed A. Ensemble learning model for diagnosing COVID-19 from routine blood tests. INFORMATICS IN MEDICINE UNLOCKED 2020; 21:100449. [PMID: 33102686 PMCID: PMC7572278 DOI: 10.1016/j.imu.2020.100449] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2020] [Revised: 09/28/2020] [Accepted: 10/07/2020] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND AND OBJECTIVES: The pandemic of novel coronavirus disease 2019 (COVID-19) has severely impacted human society with a massive death toll worldwide. There is an urgent need for early and reliable screening of COVID-19 patients to provide better and timely patient care and to combat the spread of the disease. In this context, recent studies have reported some key advantages of using routine blood tests for initial screening of COVID-19 patients. In this article, first we present a review of the emerging techniques for COVID-19 diagnosis using routine laboratory and/or clinical data. Then, we propose ERLX, which is an ensemble learning model for COVID-19 diagnosis from routine blood tests. METHOD: The proposed model uses three well-known diverse classifiers, extra trees, random forest and logistic regression, which have different architectures and learning characteristics, at the first level, and then combines their predictions by using a second-level extreme gradient boosting (XGBoost) classifier to achieve better performance. For data preparation, the proposed methodology employs a KNNImputer algorithm to handle null values in the dataset, isolation forest (iForest) to remove outlier data, and a synthetic minority oversampling technique (SMOTE) to balance the data distribution. For model interpretability, feature importances are reported by using the SHapley Additive exPlanations (SHAP) technique. RESULTS: The proposed model was trained and evaluated by using a publicly available data set from Albert Einstein Hospital in Brazil, which consisted of 5644 data samples with 559 confirmed COVID-19 cases. The ensemble model achieved outstanding performance with an overall accuracy of 99.88% [95% CI: 99.6-100], AUC of 99.38% [95% CI: 97.5-100], a sensitivity of 98.72% [95% CI: 94.6-100] and a specificity of 99.99% [95% CI: 99.99-100].
DISCUSSION: The proposed model revealed better performance when compared against existing state-of-the-art studies (Banerjee et al., 2020; de Freitas Barbosa et al., 2020; de Moraes Batista et al., 2020; Soares et al., 2020) [3,22,56,71] for the same set of features employed by them. Compared with the average accuracy of 95.159% of the best-performing Bayes Net model (de Freitas Barbosa et al., 2020) [22], ERLX achieved an average accuracy of 99.94%. In comparison with the AUC of 85% reported for the SVM model (de Moraes Batista et al., 2020) [56], ERLX obtained an AUC of 99.77% in addition to improvements in sensitivity and specificity. Compared with the average sensitivity of 70.25% and specificity of 85.98% of the ER-COV model (Soares et al., 2020) [71], the ERLX model achieved a sensitivity of 99.47% and a specificity of 99.99%. The ERLX model obtained a considerably higher score than the ANN model (Banerjee et al., 2020) [3] in all performance metrics. Therefore, the model presented is robust and can be deployed for reliable early and rapid screening of COVID-19 patients.
Affiliation(s)
- Maryam AlJame
- Computer Engineering Department, Kuwait University, Kuwait
| | - Imtiaz Ahmad
- Computer Engineering Department, Kuwait University, Kuwait
| | | | - Ameer Mohammed
- Computer Engineering Department, Kuwait University, Kuwait
| |
|
42
|
Ijaz MF, Attique M, Son Y. Data-Driven Cervical Cancer Prediction Model with Outlier Detection and Over-Sampling Methods. SENSORS 2020; 20:s20102809. [PMID: 32429090 PMCID: PMC7284557 DOI: 10.3390/s20102809] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/22/2020] [Revised: 05/11/2020] [Accepted: 05/13/2020] [Indexed: 12/29/2022]
Abstract
Globally, cervical cancer remains the foremost prevailing cancer in females. Hence, it is necessary to distinguish the importance of risk factors of cervical cancer to classify potential patients. The present work proposes a cervical cancer prediction model (CCPM) that offers early prediction of cervical cancer using risk factors as inputs. The CCPM first removes outliers using outlier detection methods such as density-based spatial clustering of applications with noise (DBSCAN) and isolation forest (iForest), and then increases the number of cases in the dataset in a balanced way, for example, through the synthetic minority over-sampling technique (SMOTE) and SMOTE with Tomek link (SMOTETomek). Finally, it employs random forest (RF) as a classifier. Thus, CCPM rests on four scenarios: (1) DBSCAN + SMOTETomek + RF, (2) DBSCAN + SMOTE + RF, (3) iForest + SMOTETomek + RF, and (4) iForest + SMOTE + RF. A dataset of 858 potential patients was used to validate the performance of the proposed method. We found that combinations of iForest with SMOTE and iForest with SMOTETomek provided better performances than those of DBSCAN with SMOTE and DBSCAN with SMOTETomek. We also observed that RF performed the best among several popular machine learning classifiers. Furthermore, the proposed CCPM showed better accuracy than previously proposed methods for forecasting cervical cancer. In addition, a mobile application that can collect cervical cancer risk factor data and provide results from CCPM was developed for instant and proper action at the initial stage of cervical cancer.
Affiliation(s)
- Muhammad Fazal Ijaz
- Department of Industrial and Systems Engineering, Dongguk University-Seoul, Seoul 04620, Korea;
| | | | - Youngdoo Son
- Department of Industrial and Systems Engineering, Dongguk University-Seoul, Seoul 04620, Korea;
- Correspondence: Tel.: +82-2-2260-3840
| |
|
43
|
Classification of Guillain–Barré Syndrome Subtypes Using Sampling Techniques with Binary Approach. Symmetry (Basel) 2020. [DOI: 10.3390/sym12030482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
Guillain–Barré Syndrome (GBS) is an unusual disorder where the body’s immune system affects the peripheral nervous system. GBS has four main subtypes, whose treatments differ. Severe cases of GBS can be fatal. This work aimed to investigate whether balancing an original GBS dataset improves the predictive models created in a previous study. Balancing a dataset means pursuing symmetry in the number of instances of each class. The dataset includes 129 records of Mexican patients diagnosed with some subtype of GBS. We created 10 binary datasets from the original dataset. Then, we balanced these datasets using four different methods to undersample the majority class and one method to oversample the minority class. Finally, we used three classifiers with different approaches to creating predictive models. The results show that balancing the original dataset improves the previous predictive models. The goal of the predictive models is to identify the GBS subtypes by applying Machine Learning algorithms. It is expected that specialists may use the model to obtain a complementary diagnosis using a reduced set of relevant features. Early identification of the subtype will allow starting the appropriate treatment for patient recovery. This is a contribution to exploring the performance of balancing techniques with real data.
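Balancing a binary dataset by resampling can be sketched as follows. The abstract does not name the four undersampling methods or the oversampling method that were used, so plain random undersampling and random oversampling with replacement are shown here as stand-ins; the function names, the seed, and the toy labels are assumptions.

```python
import random
from collections import Counter

def group_by_class(records, labels):
    """Collect the records belonging to each class label."""
    groups = {}
    for record, label in zip(records, labels):
        groups.setdefault(label, []).append(record)
    return groups

def random_undersample(records, labels, seed=0):
    """Shrink every class to the size of the smallest class."""
    rng = random.Random(seed)
    groups = group_by_class(records, labels)
    target = min(len(members) for members in groups.values())
    balanced = [(r, lab) for lab, members in groups.items()
                for r in rng.sample(members, target)]
    return [r for r, _ in balanced], [lab for _, lab in balanced]

def random_oversample(records, labels, seed=0):
    """Grow every class to the size of the largest class by resampling with replacement."""
    rng = random.Random(seed)
    groups = group_by_class(records, labels)
    target = max(len(members) for members in groups.values())
    balanced = []
    for lab, members in groups.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        balanced.extend((r, lab) for r in members + extra)
    return [r for r, _ in balanced], [lab for _, lab in balanced]

# Hypothetical imbalanced binary dataset: 8 majority and 3 minority records.
records = list(range(11))
labels = ["majority"] * 8 + ["minority"] * 3

_, under_labels = random_undersample(records, labels)
_, over_labels = random_oversample(records, labels)
print(Counter(under_labels), Counter(over_labels))
```

Undersampling discards majority records and oversampling duplicates minority ones; with only 129 records, as in the study, that trade-off is the main design choice, and resampling should be applied to training folds only.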
|
44
|
Ambika M, Raghuraman G, SaiRamesh L, Ayyasamy A. Intelligence – based decision support system for diagnosing the incidence of hypertensive type. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2020. [DOI: 10.3233/jifs-190143] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Affiliation(s)
- M. Ambika
- Department of Computer Science and Engineering, SSN College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India
| | - G. Raghuraman
- Department of Computer Science and Engineering, SSN College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India
| | - L. SaiRamesh
- Department of Information Science and Technology, CEG, Anna University Chennai, Tamil Nadu, India
| | - A. Ayyasamy
- Department of Computer Science and Engineering, Faculty of Engineering and Technology, Annamalai University, Tamil Nadu, India
| |
|
45
|
Lin CH, Hsu KC, Johnson KR, Luby M, Fann YC. Applying density-based outlier identifications using multiple datasets for validation of stroke clinical outcomes. Int J Med Inform 2019; 132:103988. [PMID: 31590140 DOI: 10.1016/j.ijmedinf.2019.103988] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2019] [Revised: 08/20/2019] [Accepted: 10/01/2019] [Indexed: 12/13/2022]
Abstract
INTRODUCTION: Clinicians commonly use the modified Rankin Scale (mRS) and the Barthel Index (BI) to measure clinical outcome after stroke. These are potential targets in machine learning models for stroke outcome prediction. Therefore, the quality of the measurements is crucial for training and validation of these models. The objective of this study was to apply and evaluate density-based outlier detection methods for identifying potentially incorrect measurements in multiple large stroke datasets to assess the measurement quality. METHOD: We applied three density-based outlier detection methods, namely density-based spatial clustering of applications with noise (DBSCAN), hierarchical DBSCAN (HDBSCAN) and local outlier factor (LOF), based on a large dataset obtained from a nationwide prospective stroke registry in Taiwan. Each method was tested using four different NINDS-funded stroke datasets. RESULT: The DBSCAN achieved a high performance across all mRS values, where the highest average accuracy was 99.2 ± 0.7 at mRS of 4 and the lowest average accuracy was 92.0 ± 4.6 at mRS of 3. The LOF achieved similar performance; however, the HDBSCAN with default parameter settings required further tuning. CONCLUSION: The density-based outlier detection methods proved promising for validation of stroke outcome measures. The outlier detection algorithm developed from a large prospective registry dataset was effectively applied to four different NINDS stroke datasets with high performance results. The tool developed from this detection algorithm can be further applied to real-world datasets to increase the data quality of stroke outcome measures.
Affiliation(s)
- Ching-Heng Lin
- Center for Information Technology, National Institutes of Health, Bethesda, MD, United States
| | - Kai-Cheng Hsu
- Bioinformatics Section, National Institute of Neurological Disorder and Stroke, National Institutes of Health, Bethesda, MD, United States; Department of Neurology, National Taiwan University Hospital, Taipei, Taiwan
| | - Kory R Johnson
- Bioinformatics Section, National Institute of Neurological Disorder and Stroke, National Institutes of Health, Bethesda, MD, United States
| | - Marie Luby
- Stroke Branch, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States
| | - Yang C Fann
- Bioinformatics Section, National Institute of Neurological Disorder and Stroke, National Institutes of Health, Bethesda, MD, United States.
| |
|
46
|
A fast and accurate approach for bankruptcy forecasting using squared logistics loss with GPU-based extreme gradient boosting. Inf Sci (N Y) 2019. [DOI: 10.1016/j.ins.2019.04.060] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
47
|
DBSCAN-Based Thermal Runaway Diagnosis of Battery Systems for Electric Vehicles. ENERGIES 2019. [DOI: 10.3390/en12152977] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Battery system diagnosis and prognosis are essential for ensuring the safe operation of electric vehicles (EVs). This paper proposes a diagnosis method of thermal runaway for ternary lithium-ion battery systems based on the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) clustering. Two-dimensional fault characteristics are first extracted according to battery voltage, and DBSCAN clustering is used to diagnose the potential thermal runaway cells (PTRC). The periodic risk assessing strategy is put forward to evaluate the fault risk of battery cells. The feasibility, reliability, stability, necessity, and robustness of the proposed algorithm are analyzed, and its effectiveness is verified based on datasets collected from real-world operating electric vehicles. The results show that the proposed method can accurately predict the locations of PTRC in the battery pack a few days before the thermal runaway occurrence.
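The clustering step in this abstract can be illustrated with a minimal pure-Python DBSCAN applied to two-dimensional feature points, where cells labelled as noise are the ones that behave unlike the rest of the pack. This is a sketch under stated assumptions, not the paper's method: the feature values, `eps`, `min_pts`, and the reading of noise points as potential thermal runaway cells are hypothetical, and the paper's feature extraction and periodic risk assessment are not reproduced.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    noise = -1
    labels = [None] * len(points)  # None marks a point not yet visited

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = noise  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(reachable)
    return labels

# Hypothetical 2-D fault features for six cells; the last cell deviates strongly.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (3.0, 3.0)]
labels = dbscan(features, eps=0.2, min_pts=3)
suspect_cells = [i for i, label in enumerate(labels) if label == -1]
print(labels, suspect_cells)
```

The neighbor search is quadratic in the number of cells, which is acceptable for a single battery pack but would need an index structure for larger data.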
|
48
|
Data Analytics in Smart Healthcare: The Recent Developments and Beyond. Applied Sciences-Basel 2019. [DOI: 10.3390/app9142812] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 01/11/2023]
Abstract
The concepts of the smart city and the Internet of Things (IoT) have facilitated the rollout of medical devices and systems that capture valuable health information. Many artificial intelligence techniques have proven effective in smart city applications such as energy, transportation, retail, and control. Over the past decade, the lag in adopting data analytics algorithms and systems in healthcare has been shrinking, and data analytics research on healthcare data has grown tremendously. The results of these analytics aim to improve people’s quality of life and to relieve medical shortages. In this special issue, “Data Analytics in Smart Healthcare”, thirteen (13) papers have been published as representative examples of recent developments. The Guest Editors also highlight some emerging topics and open challenges in healthcare analytics that follow the direction in which healthcare analytics research is moving.
|
49
|
G. SS, K. M. Diagnosis of diabetes diseases using optimized fuzzy rule set by grey wolf optimization. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2019.06.005] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 01/06/2023]
|
50
|
An Accurate Clinical Implication Assessment for Diabetes Mellitus Prevalence Based on a Study from Nigeria. Processes (Basel) 2019. [DOI: 10.3390/pr7050289] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022] Open
Abstract
The rate of diabetes is increasing across the planet. Therefore, the diagnosis of pre-diabetes and diabetes is important in populations with extreme diabetes risk. In this study, a machine learning technique was implemented on a data mining platform, employing rule classifiers (PART and Decision Table) to measure accuracy and logistic regression on the classification results to forecast prevalence among diabetes mellitus patients who also suffer from symptoms of other chronic diseases. The real-life data were collected in Nigeria between December 2017 and February 2019 using ten non-intrusive and easily available clinical variables. The results showed that the rule classifiers achieved a mean accuracy of 98.75%. The error rate, precision, recall, F-measure, and Matthews correlation coefficient (MCC) were 0.02%, 0.98%, 0.98%, 0.98%, and 0.97%, respectively. The forecast decision, reached using a set of 23 decision rules (DR), indicates that age, gender, glucose level, and body mass are fundamental reasons for diabetes, followed by work stress, diet, family diabetes history, physical exercise, and cardiovascular stroke history. The study validated that the proposed set of DR is practical for quick screening of diabetes mellitus patients at the initial stage without intrusive medical tests and was found to be effective in the initial diagnosis of diabetes.
|