1
Chen L, Ji P, Ma Y, Rong Y, Ren J. Custom machine learning algorithm for large-scale disease screening - taking heart disease data as an example. Artif Intell Med 2023; 146:102688. [PMID: 38042606 DOI: 10.1016/j.artmed.2023.102688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 10/09/2023] [Accepted: 10/13/2023] [Indexed: 12/04/2023]
Abstract
Heart disease accounts for millions of deaths worldwide annually, representing a major public health concern. Large-scale heart disease screening can yield significant benefits both in terms of lives saved and economic costs. In this study, we introduce a novel algorithm that trains a patient-specific machine learning model, aligning with the real-world demands of extensive disease screening. Customization is achieved by concentrating on three key aspects: data processing, neural network architecture, and loss function formulation. Our approach integrates individual patient data to bolster model accuracy, ensuring dependable disease detection. We assessed our models using two prominent heart disease datasets: the Cleveland dataset and the UC Irvine (UCI) combination dataset. Our models showcased notable results, achieving accuracy and recall rates beyond 95 % for the Cleveland dataset and surpassing 97 % accuracy for the UCI dataset. Moreover, in terms of medical ethics and operability, our approach outperformed traditional, general-purpose machine learning algorithms. Our algorithm provides a powerful tool for large-scale disease screening and has the potential to save lives and reduce the economic burden of heart disease.
Affiliation(s)
- Leran Chen
  - Southern University of Science and Technology, Department of Mechanical and Energy Engineering, Shenzhen, China; The Hong Kong Polytechnic University, Department of Industrial and Systems Engineering, Hong Kong, China.
- Ping Ji
  - Khalifa University, Department of Management Science and Engineering, Abu Dhabi, UAE.
- Yongsheng Ma
  - Southern University of Science and Technology, Department of Mechanical and Energy Engineering, Shenzhen, China.
- Yiming Rong
  - Southern University of Science and Technology, Department of Mechanical and Energy Engineering, Shenzhen, China.
- Jingzheng Ren
  - The Hong Kong Polytechnic University, Department of Industrial and Systems Engineering, Hong Kong, China.
2
Khan Mamun MMR, Elfouly T. Detection of Cardiovascular Disease from Clinical Parameters Using a One-Dimensional Convolutional Neural Network. Bioengineering (Basel) 2023; 10:796. [PMID: 37508823 PMCID: PMC10376462 DOI: 10.3390/bioengineering10070796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 06/29/2023] [Accepted: 06/30/2023] [Indexed: 07/30/2023] Open
Abstract
Heart disease is a significant public health problem, and early detection is crucial for effective treatment and management. Conventional and noninvasive techniques are cumbersome, time-consuming, inconvenient, expensive, and unsuitable for frequent measurement or diagnosis. With the advance of artificial intelligence (AI), new invasive techniques emerging in research are detecting heart conditions using machine learning (ML) and deep learning (DL). Machine learning models have been used with the publicly available dataset from the internet about heart health; in contrast, deep learning techniques have recently been applied to analyze electrocardiograms (ECG) or similar vital data to detect heart diseases. Significant limitations of these datasets are their small size regarding the number of patients and features and the fact that many are imbalanced datasets. Furthermore, the trained models must be more reliable and accurate in medical settings. This study proposes a hybrid one-dimensional convolutional neural network (1D CNN), which uses a large dataset accumulated from online survey data and selected features using feature selection algorithms. The 1D CNN proved to show better accuracy compared to contemporary machine learning algorithms and artificial neural networks. The non-coronary heart disease (no-CHD) and CHD validation data showed an accuracy of 80.1% and 76.9%, respectively. The model was compared with an artificial neural network, random forest, AdaBoost, and a support vector machine. Overall, 1D CNN proved to show better performance in terms of accuracy, false negative rates, and false positive rates. Similar strategies were applied for four more heart conditions, and the analysis proved that using the hybrid 1D CNN produced better accuracy.
Affiliation(s)
- Tarek Elfouly
  - Department of Electrical and Computer Engineering, Tennessee Technological University, Cookeville, TN 38505, USA
3
Garabaghi FH, Benzer S, Benzer R. Modeling dissolved oxygen concentration using machine learning techniques with dimensionality reduction approach. Environmental Monitoring and Assessment 2023; 195:879. [PMID: 37354319 DOI: 10.1007/s10661-023-11492-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 06/10/2023] [Indexed: 06/26/2023]
Abstract
Oxygen is crucial to keep the life cycle balance in any aspect. Aquatic life is highly influenced by the levels of dissolved oxygen (DO). This calls for not just constant monitoring of the DO in aquatic systems, but to generate an accurate prediction model for future levels of the DO. This study aims to propose an accurate prediction model for DO concentrations. The performance of the Random Forest (RF) and multilayer perceptron (MLP) algorithms was evaluated in generating the regression models. Moreover, the effect of dimensionality reduction of the data by the wrapper feature Selection method on the performance of the models was evaluated. The results showed that the RF regressor excelled MLP in performance with both the dataset of all variables and the dataset of reduced variables with the best performance achieved by the RF regressor by considering Pearson correlation coefficient (0.8052), Mean absolute error (0.8911), and root mean square error (1.2805) when trained by the dataset of reduced variables. As for the accuracy of the models, the estimation error deviation of both models declined significantly when trained by the reduced variables. When the accuracy of the prediction was increased by 0.95% by the RF regressor, the accuracy of the MLP was incremented by 5.7% when trained by the dataset of reduced variables. The results demonstrated the positive impact of the dimensionality reduction on the accuracy of both models. However, RF can be considered a robust regressor in predicting DO concentrations.
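The three regression metrics quoted in this abstract (Pearson correlation coefficient, mean absolute error, root mean square error) can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names and the dissolved oxygen readings are hypothetical and are not taken from the study.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    # Undefined (division by zero) if either series is constant.
    return cov / math.sqrt(var_t * var_p)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical DO readings in mg/L, made up for illustration only.
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 6.5, 8.5, 8.5]

r = pearson_r(observed, predicted)
mae = mean_absolute_error(observed, predicted)
rmse = root_mean_square_error(observed, predicted)
print(f"r={r:.4f} MAE={mae:.4f} RMSE={rmse:.4f}")
```

A higher Pearson coefficient together with lower MAE and RMSE indicates a better fit, which is how the abstract ranks the two regressors.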
Affiliation(s)
- Semra Benzer
  - Department of Science, Gazi University, Teknikokullar, 06500, Turkey
- Recep Benzer
  - Department of Management Information System, Başkent University, Bağlıca, 06790, Turkey
4
Ahsan MM, Siddique Z. Machine learning-based heart disease diagnosis: A systematic literature review. Artif Intell Med 2022; 128:102289. [DOI: 10.1016/j.artmed.2022.102289] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/13/2021] [Accepted: 03/22/2022] [Indexed: 01/01/2023]
5
Suri JS, Bhagawati M, Paul S, Protogerou AD, Sfikakis PP, Kitas GD, Khanna NN, Ruzsa Z, Sharma AM, Saxena S, Faa G, Laird JR, Johri AM, Kalra MK, Paraskevas KI, Saba L. A Powerful Paradigm for Cardiovascular Risk Stratification Using Multiclass, Multi-Label, and Ensemble-Based Machine Learning Paradigms: A Narrative Review. Diagnostics (Basel) 2022; 12:722. [PMID: 35328275 PMCID: PMC8947682 DOI: 10.3390/diagnostics12030722] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 02/21/2022] [Revised: 03/10/2022] [Accepted: 03/13/2022] [Indexed: 12/16/2022] Open
Abstract
Background and Motivation: Cardiovascular disease (CVD) causes the highest mortality globally. With escalating healthcare costs, early non-invasive CVD risk assessment is vital. Conventional methods have shown poor performance compared to more recent and fast-evolving Artificial Intelligence (AI) methods. The proposed study reviews the three most recent paradigms for CVD risk assessment, namely multiclass, multi-label, and ensemble-based methods in (i) office-based and (ii) stress-test laboratories. Methods: A total of 265 CVD-based studies were selected using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) model. Due to its popularity and recent development, the study analyzed the above three paradigms using machine learning (ML) frameworks. We review comprehensively these three methods using attributes, such as architecture, applications, pro-and-cons, scientific validation, clinical evaluation, and AI risk-of-bias (RoB) in the CVD framework. These ML techniques were then extended under mobile and cloud-based infrastructure. Findings: Most popular biomarkers used were office-based, laboratory-based, image-based phenotypes, and medication usage. Surrogate carotid scanning for coronary artery risk prediction had shown promising results. Ground truth (GT) selection for AI-based training along with scientific and clinical validation is very important for CVD stratification to avoid RoB. It was observed that the most popular classification paradigm is multiclass followed by the ensemble, and multi-label. The use of deep learning techniques in CVD risk stratification is in a very early stage of development. Mobile and cloud-based AI technologies are more likely to be the future. Conclusions: AI-based methods for CVD risk assessment are most promising and successful. Choice of GT is most vital in AI-based models to prevent the RoB. 
The amalgamation of image-based strategies with conventional risk factors provides the highest stability when using the three CVD paradigms in non-cloud and cloud-based frameworks.
Affiliation(s)
- Jasjit S. Suri
  - Stroke Diagnostic and Monitoring Division, AtheroPoint™, Roseville, CA 95661, USA
- Mrinalini Bhagawati
  - Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India
- Sudip Paul
  - Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India
- Athanasios D. Protogerou
  - Research Unit Clinic, Laboratory of Pathophysiology, Department of Cardiovascular Prevention, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Petros P. Sfikakis
  - Rheumatology Unit, National Kapodistrian University of Athens, 11527 Athens, Greece
- George D. Kitas
  - Arthritis Research UK Centre for Epidemiology, Manchester University, Manchester 46962, UK
- Narendra N. Khanna
  - Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi 110020, India
- Zoltan Ruzsa
  - Department of Internal Medicines, Invasive Cardiology Division, University of Szeged, 6720 Szeged, Hungary
- Aditya M. Sharma
  - Division of Cardiovascular Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Sanjay Saxena
  - Department of CSE, International Institute of Information Technology, Bhubaneswar 751003, India
- Gavino Faa
  - Department of Pathology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy
- John R. Laird
  - Cardiology Department, St. Helena Hospital, St. Helena, CA 94574, USA
- Amer M. Johri
  - Department of Medicine, Division of Cardiology, Queen’s University, Kingston, ON K7L 3N6, Canada
- Manudeep K. Kalra
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
- Kosmas I. Paraskevas
  - Department of Vascular Surgery, Central Clinic of Athens, N. Iraklio, 14122 Athens, Greece
- Luca Saba
  - Department of Radiology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy
6
Wadhawan S, Maini R. A Systematic Review on Prediction Techniques for Cardiac Disease. International Journal of Information Technologies and Systems Approach 2022. [DOI: 10.4018/ijitsa.290001] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
Mortality can be lowered with early prediction of cardiac diseases, which is one of the major issues in the healthcare industry. Compared with traditional methods, intelligent systems have the potential to predict these diseases accurately at an early stage, even with complex data. Researchers have presented various intelligent DSS for predicting these diseases. The objectives of this study are to examine the trends in these intelligent systems, to identify effective techniques for predicting cardiac disease, and to identify future directions. This paper therefore presents a systematic review of state-of-the-art techniques based on ML, NN and FL. For the analysis, we followed the PRISMA statement and considered studies published from 2010 to 2020 in different databases. The analysis concluded that ML-based techniques are widely used for feature selection and classification and have potential for the prediction of cardiac diseases. Future directions are to evaluate rarely used prediction techniques and to find ways of improving them so that models generalize with better prediction accuracy.
Affiliation(s)
- Savita Wadhawan
  - Department of CSE, Punjabi University, Patiala, India & MMICTBM, MM(DU), Mullana, Ambala, India
- Raman Maini
  - Department of CSE, Punjabi University, Patiala, India
7
Hatmal MM, Alshaer W, Mahmoud IS, Al-Hatamleh MAI, Al-Ameer HJ, Abuyaman O, Zihlif M, Mohamud R, Darras M, Al Shhab M, Abu-Raideh R, Ismail H, Al-Hamadi A, Abdelhay A. Investigating the association of CD36 gene polymorphisms (rs1761667 and rs1527483) with T2DM and dyslipidemia: Statistical analysis, machine learning based prediction, and meta-analysis. PLoS One 2021; 16:e0257857. [PMID: 34648514 PMCID: PMC8516279 DOI: 10.1371/journal.pone.0257857] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/14/2021] [Accepted: 09/11/2021] [Indexed: 12/15/2022] Open
Abstract
CD36 (cluster of differentiation 36) is a membrane protein involved in lipid metabolism and has been linked to pathological conditions associated with metabolic disorders, such as diabetes and dyslipidemia. A case-control study was conducted and included 177 patients with type-2 diabetes mellitus (T2DM) and 173 control subjects to study the involvement of CD36 gene rs1761667 (G>A) and rs1527483 (C>T) polymorphisms in the pathogenesis of T2DM and dyslipidemia among Jordanian population. Lipid profile, blood sugar, gender and age were measured and recorded. Also, genotyping analysis for both polymorphisms was performed. Following statistical analysis, 10 different neural networks and machine learning (ML) tools were used to predict subjects with diabetes or dyslipidemia. Towards further understanding of the role of CD36 protein and gene in T2DM and dyslipidemia, a protein-protein interaction network and meta-analysis were carried out. For both polymorphisms, the genotypic frequencies were not significantly different between the two groups (p > 0.05). On the other hand, some ML tools like multilayer perceptron gave high prediction accuracy (≥ 0.75) and Cohen's kappa (κ) (≥ 0.5). Interestingly, in K-star tool, the accuracy and Cohen's κ values were enhanced by including the genotyping results as inputs (0.73 and 0.46, respectively, compared to 0.67 and 0.34 without including them). This study confirmed, for the first time, that there is no association between CD36 polymorphisms and T2DM or dyslipidemia among Jordanian population. Prediction of T2DM and dyslipidemia, using these extensive ML tools and based on such input data, is a promising approach for developing diagnostic and prognostic prediction models for a wide spectrum of diseases, especially based on large medical databases.
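The abstract reports both accuracy and Cohen's kappa because kappa corrects raw agreement for the agreement expected by chance. The sketch below shows how the two relate, using only the standard library; the labels are made up for illustration and do not come from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (accuracy) and p_e is the agreement
    expected by chance from the marginal label frequencies.
    Undefined (division by zero) when p_e equals 1.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels (1 = case, 0 = control), for illustration only.
reference = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]

acc = accuracy(reference, predicted)
kappa = cohens_kappa(reference, predicted)
print(f"accuracy={acc:.2f} kappa={kappa:.2f}")
```

With balanced classes as in this toy example, chance agreement is 0.5, so an accuracy of 0.75 corresponds to a kappa of 0.5.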
Affiliation(s)
- Ma’mon M. Hatmal
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Walhan Alshaer
  - Cell Therapy Centre, The University of Jordan, Amman, Jordan
- Ismail S. Mahmoud
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Mohammad A. I. Al-Hatamleh
  - Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kelantan, Malaysia
- Hamzeh J. Al-Ameer
  - Department of Biology and Biotechnology, American University of Madaba, Madaba, Jordan
  - Department of Pharmacology, Faculty of Medicine, The University of Jordan, Amman, Jordan
- Omar Abuyaman
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Malek Zihlif
  - Department of Pharmacology, Faculty of Medicine, The University of Jordan, Amman, Jordan
- Rohimah Mohamud
  - Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kelantan, Malaysia
- Mais Darras
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Mohammad Al Shhab
  - Department of Pharmacology, Faculty of Medicine, The University of Jordan, Amman, Jordan
- Rand Abu-Raideh
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Hilweh Ismail
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Ali Al-Hamadi
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Ali Abdelhay
  - Department of Pharmacology, Faculty of Medicine, The University of Jordan, Amman, Jordan
8
Shanbehzadeh M, Kazemi-Arpanahi H, Orooji A, Mobarak S, Jelvay S. Performance evaluation of selected machine learning algorithms for COVID-19 prediction using routine clinical data: With versus Without CT scan features. Journal of Education and Health Promotion 2021; 10:285. [PMID: 34667785 PMCID: PMC8459865 DOI: 10.4103/jehp.jehp_1424_20] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/24/2020] [Accepted: 11/19/2020] [Indexed: 06/13/2023]
Abstract
BACKGROUND Given the unknown nature of coronavirus disease (COVID-19), diagnosis and treatment have remained very complex up to the present time. Thus, it is essential to have a framework for early prediction of the disease. In this regard, machine learning (ML) could be crucial for extracting concealed patterns by mining huge raw datasets and then establishing high-quality predictive models. At this juncture, we aimed to apply different ML techniques to develop clinical predictive models and select the best-performing one. MATERIALS AND METHODS The dataset of Ayatollah Talleghani hospital, the COVID-19 focal center affiliated with Abadan University of Medical Sciences, was used. The dataset consists of 501 case records with two classes (COVID-19 and non-COVID-19) and 32 columns for the diagnostic features. ML algorithms such as Naïve Bayesian, Bayesian Net, random forest (RF), multilayer perceptron, K-star, C4.5, and support vector machine were developed. Then, the performance of the selected ML models was assessed by comparing performance indices such as accuracy, sensitivity, specificity, precision, F-score, and receiver operating characteristic (ROC). RESULTS The experimental results indicate that the RF algorithm, with an accuracy of 92.42%, specificity of 75.70%, precision of 92.30%, sensitivity of 92.40%, F-measure of 92.00%, and ROC of 97.15%, has the best capability for COVID-19 diagnosis and screening. CONCLUSION The empirical results reveal that the RF model yielded higher performance than the other six classification models. Implementing the RF model in health-care settings is promising for increasing the accuracy and speed of disease diagnosis for primary prevention, screening, surveillance, and early treatment.
Affiliation(s)
- Mostafa Shanbehzadeh
  - Assistant Professor of Health Information Management, Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
- Hadi Kazemi-Arpanahi
  - Assistant Professor of Health Information Management, Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran
  - Assistant Professor of Health Information Management, Student Research Committee, Abadan University of Medical Sciences, Abadan, Iran
- Azam Orooji
  - Assistant Professor of Medical Informatics, School of Medicine, North Khorasan University of Medical Science, North Khorasan, Iran
- Sara Mobarak
  - Assistant Professor of Infectious Diseases, School of Medicine, Abadan University of Medical Sciences, Abadan, Iran
- Saeed Jelvay
  - MSc of Health Information Technology, Department of Student Research Committee, Abadan University of Medical Sciences, Abadan, Iran
9
Hatmal MM, Al-Hatamleh MAI, Olaimat AN, Hatmal M, Alhaj-Qasem DM, Olaimat TM, Mohamud R. Side Effects and Perceptions Following COVID-19 Vaccination in Jordan: A Randomized, Cross-Sectional Study Implementing Machine Learning for Predicting Severity of Side Effects. Vaccines (Basel) 2021; 9:556. [PMID: 34073382 PMCID: PMC8229440 DOI: 10.3390/vaccines9060556] [Citation(s) in RCA: 108] [Impact Index Per Article: 36.0] [Received: 05/02/2021] [Revised: 05/20/2021] [Accepted: 05/20/2021] [Indexed: 02/06/2023] Open
Abstract
Background: Since the coronavirus disease 2019 (COVID-19) was declared a pandemic, there was no doubt that vaccination is the ideal protocol to tackle it. Within a year, a few COVID-19 vaccines have been developed and authorized. This unparalleled initiative in developing vaccines created many uncertainties looming around the efficacy and safety of these vaccines. This study aimed to assess the side effects and perceptions following COVID-19 vaccination in Jordan. Methods: A cross-sectional study was conducted by distributing an online survey targeted toward Jordan inhabitants who received any COVID-19 vaccines. Data were statistically analyzed and certain machine learning (ML) tools, including multilayer perceptron (MLP), eXtreme gradient boosting (XGBoost), random forest (RF), and K-star were used to predict the severity of side effects. Results: A total of 2213 participants were involved in the study after receiving Sinopharm, AstraZeneca, Pfizer-BioNTech, and other vaccines (38.2%, 31%, 27.3%, and 3.5%, respectively). Generally, most of the post-vaccination side effects were common and non-life-threatening (e.g., fatigue, chills, dizziness, fever, headache, joint pain, and myalgia). Only 10% of participants suffered from severe side effects; while 39% and 21% of participants had moderate and mild side effects, respectively. Despite the substantial variations between these vaccines in the presence and severity of side effects, the statistical analysis indicated that these vaccines might provide the same protection against COVID-19 infection. Finally, around 52.9% of participants suffered before vaccination from vaccine hesitancy and anxiety; while after vaccination, 95.5% of participants have advised others to get vaccinated, 80% felt more reassured, and 67% believed that COVID-19 vaccines are safe in the long term. 
Furthermore, based on the type of vaccine, demographic data, and side effects, the RF, XGBoost, and MLP gave both high accuracies (0.80, 0.79, and 0.70, respectively) and Cohen’s kappa values (0.71, 0.70, and 0.56, respectively). Conclusions: The present study confirmed that the authorized COVID-19 vaccines are safe and getting vaccinated makes people more reassured. Most of the post-vaccination side effects are mild to moderate, which are signs that body’s immune system is building protection. ML can also be used to predict the severity of side effects based on the input data; predicted severe cases may require more medical attention or even hospitalization.
Affiliation(s)
- Ma’mon M. Hatmal
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa 13133, Jordan
  - Correspondence: (M.M.H.); (R.M.)
- Mohammad A. I. Al-Hatamleh
  - Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kelantan 16150, Malaysia
- Amin N. Olaimat
  - Department of Clinical Nutrition and Dietetics, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa 13133, Jordan
- Rohimah Mohamud
  - Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kelantan 16150, Malaysia
  - Correspondence: (M.M.H.); (R.M.)
10
Bayesian Network as a Decision Tool for Predicting ALS Disease. Brain Sci 2021; 11:150. [PMID: 33498784 PMCID: PMC7912628 DOI: 10.3390/brainsci11020150] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/21/2020] [Revised: 01/09/2021] [Accepted: 01/20/2021] [Indexed: 12/14/2022] Open
Abstract
Clinical diagnosis of amyotrophic lateral sclerosis (ALS) is difficult in the early period. Blood tests, however, are less time-consuming and lower in cost than other diagnostic methods. ALS researchers have used machine learning methods to predict the genetic architecture of the disease. In this study, we take advantage of Bayesian networks and machine learning methods to predict ALS patients from blood plasma protein levels and independent personal features. According to the comparison results, Bayesian networks produced the best results, with accuracy (0.887), area under the curve (AUC) (0.970) and other comparison metrics. We confirmed that sex and age are effective variables in ALS. In addition, we found that the probability of onset involvement in ALS patients is very high. Also, a person’s other chronic or neurological diseases are associated with ALS. Finally, we confirmed that the Parkin level may also have an effect on ALS. While this protein is at very low levels in Parkinson’s patients, it is higher in ALS patients than in all control groups.
11
Sabahi F. Bimodal fuzzy analytic hierarchy process (BFAHP) for coronary heart disease risk assessment. J Biomed Inform 2018; 83:204-216. [PMID: 29625186 DOI: 10.1016/j.jbi.2018.03.016] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 07/31/2017] [Revised: 03/13/2018] [Accepted: 03/31/2018] [Indexed: 01/24/2023]
Abstract
Rooted deeply in medical multiple criteria decision-making (MCDM), risk assessment is very important especially when applied to the risk of being affected by deadly diseases such as coronary heart disease (CHD). CHD risk assessment is a stochastic, uncertain, and highly dynamic process influenced by various known and unknown variables. In recent years, there has been a great interest in fuzzy analytic hierarchy process (FAHP), a popular methodology for dealing with uncertainty in MCDM. This paper proposes a new FAHP, bimodal fuzzy analytic hierarchy process (BFAHP) that augments two aspects of knowledge, probability and validity, to fuzzy numbers to better deal with uncertainty. In BFAHP, fuzzy validity is computed by aggregating the validities of relevant risk factors based on expert knowledge and collective intelligence. By considering both soft and statistical data, we compute the fuzzy probability of risk factors using the Bayesian formulation. In BFAHP approach, these fuzzy validities and fuzzy probabilities are used to construct a reciprocal comparison matrix. We then aggregate fuzzy probabilities and fuzzy validities in a pairwise manner for each risk factor and each alternative. BFAHP decides about being affected and not being affected by ranking of high and low risks. For evaluation, the proposed approach is applied to the risk of being affected by CHD using a real dataset of 152 patients of Iranian hospitals. Simulation results confirm that adding validity in a fuzzy manner can accrue more confidence of results and clinically useful especially in the face of incomplete information when compared with actual results. Applying the proposed BFAHP on CHD risk assessment of the dataset, it yields high accuracy rate above 85% for correct prediction. In addition, this paper recognizes that the risk factors of diastolic blood pressure in men and high-density lipoprotein in women are more important in CHD than other risk factors.
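For readers unfamiliar with the reciprocal comparison matrix mentioned in this abstract, the sketch below derives priority weights from such a matrix using the row geometric mean method of classic (crisp) AHP. It is not the BFAHP procedure itself, which works with fuzzy probabilities and validities; the three risk factors and their pairwise judgments are hypothetical.

```python
import math


def is_reciprocal(matrix, tol=1e-9):
    """Check that a[i][j] * a[j][i] == 1 for every pair, as AHP requires."""
    n = len(matrix)
    return all(
        abs(matrix[i][j] * matrix[j][i] - 1.0) <= tol
        for i in range(n)
        for j in range(n)
    )


def ahp_weights(matrix):
    """Priority weights via the row geometric mean, normalised to sum to 1."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical pairwise judgments for three risk factors (illustrative only):
# entry [i][j] states how much more important factor i is than factor j.
comparison = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

assert is_reciprocal(comparison)
weights = ahp_weights(comparison)
print([round(w, 4) for w in weights])
```

Because this example matrix is perfectly consistent, the weights come out in the ratio 4:2:1; real expert judgments are rarely this consistent, which is part of the uncertainty that fuzzy variants aim to handle.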
Affiliation(s)
- Farnaz Sabahi
  - Soft Computing Laboratory, Faculty of Electrical and Computer Engineering, Urmia University, Urmia, Iran.
12
Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues. Sci Rep 2016; 6:39655. [PMID: 28000796 PMCID: PMC5175133 DOI: 10.1038/srep39655] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 09/28/2016] [Accepted: 11/24/2016] [Indexed: 12/23/2022] Open
Abstract
The Dishevelled/EGL-10/Pleckstrin (DEP) domain-containing (DEPDC) proteins have seven members. However, whether this superfamily can be distinguished from other proteins based only on the amino acid sequences, remains unknown. Here, we describe a computational method to segregate DEPDCs and non-DEPDCs. First, we examined the Pfam numbers of the known DEPDCs and used the longest sequences for each Pfam to construct a phylogenetic tree. Subsequently, we extracted 188-dimensional (188D) and 20D features of DEPDCs and non-DEPDCs and classified them with random forest classifier. We also mined the motifs of human DEPDCs to find the related domains. Finally, we designed experimental verification methods of human DEPDC expression at the mRNA level in hepatocellular carcinoma (HCC) and adjacent normal tissues. The phylogenetic analysis showed that the DEPDCs superfamily can be divided into three clusters. Moreover, the 188D and 20D features can both be used to effectively distinguish the two protein types. Motif analysis revealed that the DEP and RhoGAP domain was common in human DEPDCs, human HCC and the adjacent tissues that widely expressed DEPDCs. However, their regulation was not identical. In conclusion, we successfully constructed a binary classifier for DEPDCs and experimentally verified their expression in human HCC tissues.
|
13
|
Wiharto W, Kusnanto H, Herianto H. Interpretation of Clinical Data Based on C4.5 Algorithm for the Diagnosis of Coronary Heart Disease. Healthc Inform Res 2016; 22:186-95. [PMID: 27525160 PMCID: PMC4981579 DOI: 10.4258/hir.2016.22.3.186] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 03/24/2016] [Revised: 06/20/2016] [Accepted: 07/13/2016] [Indexed: 11/25/2022] Open
Abstract
Objectives: The interpretation of clinical data for the diagnosis of coronary heart disease can be done using algorithms in data mining. Most clinical data interpretation systems for diagnosis developed using data mining algorithms with a black-box approach cannot recognize the relationships between examination attributes and the incidence of coronary heart disease. Methods: This study proposes a system to interpret clinical examination results for the diagnosis of coronary heart disease based on the decision tree algorithm. This system comprises several stages: oversampling with the synthetic minority oversampling technique (SMOTE), feature selection, and classification with the C4.5 algorithm. System testing is done using k-fold cross-validation. The performance parameters are sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the area under the curve (AUC). Results: The results showed that the system has a sensitivity of 74.7%, a specificity of 93.7%, a PPV of 74.2%, an NPV of 93.7%, and an AUC of 84.2%. Conclusions: This study demonstrated that, by using the C4.5 algorithm, data can be interpreted in the form of a decision tree to aid the clinician's understanding. In addition, the proposed system can provide better performance by category.
Affiliation(s)
- Wiharto Wiharto
- Department of Informatic, Sebelas Maret University, Surakarta, Indonesia; Department of Biomedical Engineering, Gadjah Mada University, Yogyakarta, Indonesia
- Hari Kusnanto
- Department of Biomedical Engineering, Gadjah Mada University, Yogyakarta, Indonesia; Department of Medicine, Gadjah Mada University, Yogyakarta, Indonesia
- Herianto Herianto
- Department of Biomedical Engineering, Gadjah Mada University, Yogyakarta, Indonesia; Department of Mechanical & Industrial Engineering, Gadjah Mada University, Yogyakarta, Indonesia
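The performance parameters named in the abstract of this entry (sensitivity, specificity, PPV, NPV) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name screening_metrics and the counts in the usage line are hypothetical and are not the study's data.

```python
import math


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased cases predicted positive/negative.
    tn/fp: healthy cases predicted negative/positive.
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),  # share of diseased cases detected
        "specificity": _ratio(tn, tn + fp),  # share of healthy cases cleared
        "ppv": _ratio(tp, tp + fp),          # share of positive calls that are right
        "npv": _ratio(tn, tn + fn),          # share of negative calls that are right
    }


# Hypothetical counts, for illustration only.
metrics = screening_metrics(tp=8, fp=2, tn=18, fn=2)
```

Under k-fold cross-validation, one common approach is to compute these per fold and average them, or to pool the counts across folds first; the abstract does not say which was used.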
|
14
|
Liao Z, Ju Y, Zou Q. Prediction of G Protein-Coupled Receptors with SVM-Prot Features and Random Forest. SCIENTIFICA 2016; 2016:8309253. [PMID: 27529053 PMCID: PMC4978840 DOI: 10.1155/2016/8309253] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 05/26/2016] [Revised: 06/26/2016] [Accepted: 06/30/2016] [Indexed: 06/06/2023]
Abstract
G protein-coupled receptors (GPCRs) are the largest receptor superfamily. In this paper, we employ physicochemical properties derived from SVM-Prot to represent GPCRs. Random Forest was used as the classifier to distinguish them from other protein sequences. The MEME suite was used to detect the 10 most significant conserved motifs of human GPCRs. On the testing datasets, the average accuracy was 91.61%, and the average AUC was 0.9282. MEME discovery analysis showed that many motifs aggregated in the seven hydrophobic transmembrane helix regions, consistent with the characteristics of GPCRs. All of the above indicates that our machine-learning method can successfully distinguish GPCRs from non-GPCRs.
Affiliation(s)
- Zhijun Liao
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian 350108, China
- School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
- Ying Ju
- School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005, China
- Quan Zou
- School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin 300071, China
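The abstract of this entry reports an average AUC alongside accuracy. As a reminder of what that number measures, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The function name roc_auc and the labels and scores are hypothetical; this is not the authors' evaluation code.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. GPCR), 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical scores, for illustration only.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of examples, which is fine for small test sets; rank-based formulas give the same value faster on large ones.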
|
15
|
Miranda E, Irwansyah E, Amelga AY, Maribondang MM, Salim M. Detection of Cardiovascular Disease Risk's Level for Adults Using Naive Bayes Classifier. Healthc Inform Res 2016; 22:196-205. [PMID: 27525161 PMCID: PMC4981580 DOI: 10.4258/hir.2016.22.3.196] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 05/04/2016] [Revised: 06/19/2016] [Accepted: 06/30/2016] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVES: The number of deaths caused by cardiovascular disease and stroke is predicted to reach 23.3 million in 2030. As a contribution to support prevention of this phenomenon, this paper proposes a mining model using a naïve Bayes classifier that could detect cardiovascular disease and identify its risk level for adults. METHODS: The process of designing the method began by identifying the knowledge related to the cardiovascular disease profile and the level of cardiovascular disease risk factors for adults based on the medical record, and designing a mining technique model using a naïve Bayes classifier. Evaluation of this research employed two methods: calculation of accuracy, sensitivity, and specificity, and an evaluation session with cardiologists and internists. The characteristics of cardiovascular disease are identified by its primary risk factors. Those factors are diabetes mellitus, the level of lipids in the blood, coronary artery function, and kidney function. Class labels were assigned according to the values of these factors: risk level 1, risk level 2, and risk level 3. RESULTS: The evaluation of the classifier performance (accuracy, sensitivity, and specificity) in this research showed that the proposed model predicted the class label of tuples correctly (above 80%). More than eighty percent of respondents (including cardiologists and internists) who participated in the evaluation session agreed or strongly agreed that this research followed medical procedures and that the result can support medical analysis related to cardiovascular disease. CONCLUSIONS: The research showed that the proposed model achieves good performance for risk level detection of cardiovascular disease.
Affiliation(s)
- Eka Miranda
- School of Information System, Bina Nusantara University, Jakarta, Indonesia
- Edy Irwansyah
- School of Information System, Bina Nusantara University, Jakarta, Indonesia
- Mulyadi Salim
- School of Information System, Bina Nusantara University, Jakarta, Indonesia
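The abstract of this entry describes assigning a risk level from categorical risk factors with a naïve Bayes classifier. A minimal sketch of that technique follows, assuming categorical features and Laplace (add-one) smoothing; the function names, feature values, and training rows are all hypothetical and do not reproduce the paper's model or data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    feature_values = defaultdict(set)    # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the class with the highest smoothed log-posterior for `row`."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Laplace smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical records: (lipid level, diabetes) -> risk level.
rows = [("high", "yes"), ("high", "no"), ("low", "no"), ("low", "no")]
labels = [3, 3, 1, 1]
model = train_naive_bayes(rows, labels)
risk = predict(model, ("high", "yes"))
```

Working in log space avoids numerical underflow when many features are multiplied together.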
|