1
|
Palacios-Ariza MA, Morales-Mendoza E, Murcia J, Arias-Duarte R, Lara-Castellanos G, Cely-Jiménez A, Rincón-Acuña JC, Araúzo-Bravo MJ, McDouall J. Prediction of patient admission and readmission in adults from a Colombian cohort with bipolar disorder using artificial intelligence. Front Psychiatry 2023; 14:1266548. [PMID: 38179255 PMCID: PMC10764573 DOI: 10.3389/fpsyt.2023.1266548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 11/30/2023] [Indexed: 01/06/2024]
Abstract
Introduction: Bipolar disorder (BD) is a chronically progressive mental condition, associated with a reduced quality of life and greater disability. Patient admissions are preventable events with a considerable impact on global functioning and social adjustment. While machine learning (ML) approaches have proven prediction ability in other diseases, little is known about their utility to predict patient admissions in this pathology.
Aim: To develop prediction models for hospital admission/readmission within 5 years of diagnosis in patients with BD using ML techniques.
Methods: The study utilized data from patients diagnosed with BD in a major healthcare organization in Colombia. Candidate predictors were selected from Electronic Health Records (EHRs) and included sociodemographic and clinical variables. ML algorithms, including Decision Trees, Random Forests, Logistic Regressions, and Support Vector Machines, were used to predict patient admission or readmission. Survival models, including a penalized Cox Model and Random Survival Forest, were used to predict time to admission and first readmission. Model performance was evaluated using accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (AUC) and concordance index.
Results: The admission dataset included 2,726 BD patients, with 354 admissions, while the readmission dataset included 352 patients, with almost half being readmitted. The best-performing model for predicting admission was the Random Forest, with an accuracy score of 0.951 and an AUC of 0.98. The variables with the greatest predictive power in the Recursive Feature Elimination (RFE) importance analysis were the number of psychiatric emergency visits, the number of outpatient follow-up appointments and age. Survival models showed similar results, with the Random Survival Forest performing best, achieving an AUC of 0.95. However, the prediction models for patient readmission had poorer performance, with the Random Forest model being again the best performer but with an AUC below 0.70.
Conclusion: ML models, particularly the Random Forest model, outperformed traditional statistical techniques for admission prediction. However, readmission prediction models had poorer performance. This study demonstrates the potential of ML techniques in improving prediction accuracy for BD patient admissions.
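As a rough illustration of how to read the accuracy figure in the abstract above, the sketch below computes the accuracy of a trivial classifier that always predicts the majority class ("no admission") on a cohort with the same class balance. Only the two counts (2,726 patients, 354 admissions) come from the abstract; the function name is hypothetical and this is not the authors' evaluation code.

```python
def majority_baseline_accuracy(n_total: int, n_positive: int) -> float:
    """Accuracy of a classifier that always predicts the majority class."""
    n_negative = n_total - n_positive
    return max(n_positive, n_negative) / n_total


# Counts taken from the abstract: 2,726 patients, 354 of them admitted.
baseline = majority_baseline_accuracy(2726, 354)
print(f"{baseline:.3f}")  # prints 0.870
```

Under these assumptions the trivial baseline already reaches about 0.870, so the reported accuracy of 0.951 is best read alongside the class-balance-aware metrics the abstract lists (precision, recall, F1 score, AUC).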
Affiliation(s)
- Esteban Morales-Mendoza
  - Fundación Universitaria Sanitas, Gerencia y Gestión Sanitaria Research Group, Instituto de Gerencia y Gestión Sanitaria (IGGS), Bogotá, Colombia
- Jossie Murcia
  - Fundación Universitaria Sanitas, Gerencia y Gestión Sanitaria Research Group, Instituto de Gerencia y Gestión Sanitaria (IGGS), Bogotá, Colombia
- Rafael Arias-Duarte
  - Psicopatología y Sociedad Research Group, Facultad de Medicina, Fundación Universitaria Sanitas, Bogotá, Colombia
- Germán Lara-Castellanos
  - Psicopatología y Sociedad Research Group, Facultad de Medicina, Fundación Universitaria Sanitas, Bogotá, Colombia
- Marcos J. Araúzo-Bravo
  - Keralty, Bogotá, Colombia
  - Computational Biology and Systems Biomedicine, Biodonostia Health Research Institute, San Sebastián, Spain
  - Ikerbasque, Basque Foundation for Science, Bilbao, Spain
  - Department of Cell Biology and Histology, Faculty of Medicine and Nursing, University of Basque Country (UPV/EHU), Leioa, Spain
- Jorge McDouall
  - Sanitas Crea Research Group, Fundación Universitaria Sanitas, Bogotá, Colombia
|
2
|
Boutaib S, Elarbi M, Bechikh S, Coello CAC, Said LB. Uncertainty-wise software anti-patterns detection: A possibilistic evolutionary machine learning approach. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
3
|
PDAUG: a Galaxy based toolset for peptide library analysis, visualization, and machine learning modeling. BMC Bioinformatics 2022; 23:197. [PMID: 35643441 PMCID: PMC9148462 DOI: 10.1186/s12859-022-04727-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/29/2021] [Accepted: 05/11/2022] [Indexed: 11/28/2022]
Abstract
Background: Computational methods based on initial screening and prediction of peptides for desired functions have proven to be effective alternatives to the lengthy and expensive biochemical experimental methods traditionally used in peptide research, thus saving time and effort. However, for many researchers, a lack of expertise in using programming libraries, limited access to computational resources, and the absence of flexible pipelines are major hurdles to adopting these advanced methods.
Results: To address the above-mentioned barriers, we have implemented the peptide design and analysis under Galaxy (PDAUG) package, a Galaxy-based, Python-powered collection of tools, workflows, and datasets for rapid in-silico peptide library analysis. In contrast to existing methods like standard programming libraries or rigid single-function web-based tools, PDAUG offers an integrated GUI-based toolset, providing flexibility to build and distribute reproducible pipelines and workflows without programming expertise. Finally, we demonstrate the usability of PDAUG in predicting anticancer properties of peptides using four different feature sets and assess the suitability of various ML algorithms.
Conclusion: PDAUG offers tools for peptide library generation, data visualization, built-in and public database peptide sequence retrieval, peptide feature calculation, and machine learning (ML) modeling. Additionally, this toolset lets researchers combine PDAUG with hundreds of compatible existing Galaxy tools for limitless analytic strategies.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04727-6.
|
4
|
A Novel Approach for Feature Selection and Classification of Diabetes Mellitus: Machine Learning Methods. Computational Intelligence and Neuroscience 2022; 2022:3820360. [PMID: 35463255 PMCID: PMC9033325 DOI: 10.1155/2022/3820360] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/16/2022] [Revised: 03/12/2022] [Accepted: 03/19/2022] [Indexed: 01/12/2023]
Abstract
Diabetes prediction is an active research area in which medical experts are trying to model the problem with greater accuracy. Surveys conducted by the WHO have shown a remarkable increase in the number of diabetic patients. Diabetes generally remains dormant and can aggravate other conditions a patient is diagnosed with, such as damage to the kidney vessels, retinal problems, and cardiac problems; if unidentified, it can cause metabolic disorders and many complications in the body. The main objective of our study is to compare different classifiers and feature selection methods for predicting diabetes with greater accuracy. In this paper, we studied multilayer perceptron, decision tree, K-nearest neighbour, and random forest classifiers, and applied a few feature selection techniques to the classifiers to detect diabetes at an early stage. Raw data are preprocessed by removing outliers and imputing missing values with the mean, followed by hyperparameter optimization. Experiments were conducted on the PIMA Indians diabetes dataset using Weka 3.9; the accuracy achieved was 77.60% for the multilayer perceptron, 76.07% for decision trees, 78.58% for K-nearest neighbour, and 79.8% for random forest, the highest of the four classifiers.
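The mean-imputation step mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' Weka workflow; the function name and the toy values are hypothetical and are not taken from the PIMA dataset.

```python
from statistics import mean


def impute_column_mean(rows):
    """Replace None entries with the mean of the observed values in the same column.

    Assumes every column has at least one observed value; otherwise
    statistics.mean raises StatisticsError.
    """
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed))
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy records with two numeric features; None marks a missing value.
toy = [[148.0, 33.6], [None, 26.6], [183.0, None]]
filled = impute_column_mean(toy)
print(filled)
```

In this toy example the missing entry in the first column is filled with 165.5, the mean of 148.0 and 183.0. Note that the column means here are computed over the whole table; in a train/test evaluation they would normally be computed on the training split only.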
|
5
|
Bhardwaj P, Tiwari P, Olejar K, Parr W, Kulasiri D. A machine learning application in wine quality prediction. Machine Learning with Applications 2022. [DOI: 10.1016/j.mlwa.2022.100261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
6
|
Sisodia D, Sisodia DS. Gradient boosting learning for fraudulent publisher detection in online advertising. Data Technologies and Applications 2020. [DOI: 10.1108/dta-04-2020-0093] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
Abstract
Purpose: Analysis of the publisher's behavior plays a vital role in identifying fraudulent publishers in the pay-per-click model of online advertising. However, the vast amount of raw user click data with missing values poses a challenge in analyzing the conduct of publishers. The presence of high cardinality in categorical attributes with multiple possible values has further aggravated the issue.
Design/methodology/approach: In this paper, gradient tree boosting (GTB) learning is used to address the challenges encountered in learning the publishers' behavior from raw user click data and effectively classifying fraudulent publishers.
Findings: The results demonstrate that the GTB effectively classified fraudulent publishers and exhibited significantly improved performance as compared to other learning methods in terms of average precision (60.5%), recall (57.8%) and f-measure (59.1%).
Originality/value: The experiments were conducted using a publicly available multiclass raw user click dataset and eight other imbalanced datasets to test the GTB's generalizing behavior, while training and testing were done using 10-fold cross-validation. The performance of GTB was evaluated using average precision, recall and f-measure. The performance of GTB learning was also compared with eleven other state-of-the-art individual and ensemble classification models.
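As a quick consistency check on the figures reported in the abstract above, the f-measure can be computed as the harmonic mean of precision and recall. The sketch below uses the standard F1 definition; the function name is hypothetical, and because the abstract reports averaged metrics, the harmonic mean of the averages is only an approximation of an average of per-class f-measures.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Average precision and recall as reported in the abstract (60.5% and 57.8%).
f = f_measure(0.605, 0.578)
print(f"{f:.3f}")  # prints 0.591
```

The result agrees with the reported f-measure of 59.1% to three decimal places.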
|
7
|
A novel possibilistic artificial immune-based classifier for course learning outcome enhancement. Knowl Inf Syst 2020. [DOI: 10.1007/s10115-020-01465-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
8
|
Sagheer A, Zidan M, Abdelsamea MM. A Novel Autonomous Perceptron Model for Pattern Classification Applications. Entropy 2019; 21:e21080763. [PMID: 33267477 PMCID: PMC7515292 DOI: 10.3390/e21080763] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 05/29/2019] [Revised: 07/30/2019] [Accepted: 07/30/2019] [Indexed: 02/08/2023]
Abstract
Pattern classification represents a challenging problem in machine learning and data science research domains, especially when there is a limited availability of training samples. In recent years, artificial neural network (ANN) algorithms have demonstrated astonishing performance when compared to traditional generative and discriminative classification algorithms. However, due to the complexity of classical ANN architectures, ANNs are sometimes incapable of providing efficient solutions when addressing complex distribution problems. Motivated by the mathematical definition of a quantum bit (qubit), we propose a novel autonomous perceptron model (APM) that can solve the problem of the architecture complexity of traditional ANNs. APM is a nonlinear classification model that has a simple and fixed architecture inspired by the computational superposition power of the qubit. The proposed perceptron is able to construct the activation operators autonomously after a limited number of iterations. Several experiments using various datasets are conducted, where all the empirical results show the superiority of the proposed model as a classifier in terms of accuracy and computational time when it is compared with baseline classification models.
Affiliation(s)
- Alaa Sagheer
  - College of Computer Science and Information Technology, King Faisal University, AlAhsa 31982, Saudi Arabia
  - Center for Artificial Intelligence and Robotics (CAIRO), Faculty of Science, Aswan University, Aswan 81528, Egypt
- Mohammed Zidan
  - University of Science and Technology, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza 12578, Egypt
  - Correspondence:
- Mohammed M. Abdelsamea
  - Department of Mathematics, Faculty of Science, Assiut University, Assiut 71515, Egypt
  - School of Computer Science, University of Nottingham, Nottingham NG8 1BB, UK
|
9
|
Game PS, Vaze V, Emmanuel M. Optimized Decision tree rules using divergence based grey wolf optimization for big data classification in health care. Evolutionary Intelligence 2019. [DOI: 10.1007/s12065-019-00267-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
|
10
|
Application of Game Theory against Nature in the Assessment of Technical Solutions Used in River Regulation in the Context of Aquatic Plant Protection. Sustainability 2019. [DOI: 10.3390/su11051260] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022]
Abstract
The anthropogenic transformation of riverbeds causes a violation of the dynamic equilibrium of the river and its environment, threatening the ecological safety of aquatic ecosystems and dependent waters. However, the differing results of these transformations are dependent on many factors and it is difficult to determine them precisely before the works start. The designers and contractors of these works are dealing with the riverbed, which in terms of hydromorphological and biological features is variable, unique, and strongly diverse. Thus, decisions are followed by an unknown result concerning changes in the riverbed ecosystems. The aim of this study is to determine the suitability of game theory as a tool supporting decision-making in the design of regulatory works including ecological aspects, as well as an indication of a regulatory works model that would meet the expectations of water users while corresponding to environmentally friendly riverbed regulation. The analysis was made on the basis of observed changes in the number of species in aquatic plant vascular communities—one of the most important elements of a riverbed ecosystem. Using game theory, it is possible to create an effective tool for the design of regulatory works and decision-making process.
|
11
|
Baati K, Hamdani TM, Alimi AM, Abraham A. A new classifier for categorical data based on a possibilistic estimation and a novel generalized minimum-based algorithm. Journal of Intelligent & Fuzzy Systems 2017. [DOI: 10.3233/jifs-15372] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/15/2022]
Affiliation(s)
- Karim Baati
  - REGIM-Lab.: REsearch Groups on Intelligent Machines, University of Sfax, National Engineering School of Sfax (ENIS), Sfax, Tunisia
  - Esprit School of Engineering, Tunis, Tunisia
- Tarek M. Hamdani
  - REGIM-Lab.: REsearch Groups on Intelligent Machines, University of Sfax, National Engineering School of Sfax (ENIS), Sfax, Tunisia
  - Taibah University, College of Science and Arts at Al-Ula, Al-Madinah al-Munawwarah, KSA
- Adel M. Alimi
  - REGIM-Lab.: REsearch Groups on Intelligent Machines, University of Sfax, National Engineering School of Sfax (ENIS), Sfax, Tunisia
- Ajith Abraham
  - Machines Intelligence Research Labs (MIR Labs), Scientific Network for Innovation and Research Excellence, Auburn, WA, USA
|
12
|
Amini P, Maroufizadeh S, Samani RO, Hamidi O, Sepidarkish M. Prevalence and Determinants of Preterm Birth in Tehran, Iran: A Comparison between Logistic Regression and Decision Tree Methods. Osong Public Health Res Perspect 2017; 8:195-200. [PMID: 28781942 PMCID: PMC5525564 DOI: 10.24171/j.phrp.2017.8.3.06] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 04/10/2017] [Accepted: 05/17/2017] [Indexed: 01/12/2023]
Abstract
Objectives: Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods.
Methods: This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6–21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used.
Results: The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB (p < 0.05).
Conclusion: Identifying and training mothers at risk as well as improving prenatal care may reduce the PTB rate. We also recommend that statisticians utilize the logistic regression model for the classification of risk groups for PTB.
Affiliation(s)
- Payam Amini
  - Department of Epidemiology and Reproductive Health, Reproductive Epidemiology Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
- Saman Maroufizadeh
  - Department of Epidemiology and Reproductive Health, Reproductive Epidemiology Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
- Reza Omani Samani
  - Department of Epidemiology and Reproductive Health, Reproductive Epidemiology Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
- Omid Hamidi
  - Department of Science, Hamadan University of Technology, Hamadan, Iran
- Mahdi Sepidarkish
  - Department of Epidemiology and Reproductive Health, Reproductive Epidemiology Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
|
13
|
|
14
|
Lertworaprachaya Y, Yang Y, John R. Interval-valued fuzzy decision trees with optimal neighbourhood perimeter. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2014.08.060] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/30/2022]
|
15
|
|
16
|
|
17
|
|
18
|
Classifier Ensemble for Uncertain Data Stream Classification. Advances in Knowledge Discovery and Data Mining 2010. [DOI: 10.1007/978-3-642-13657-3_52] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 01/12/2023]
|