1
He Y, Huang R, Zhang R, He F, Han L, Han W. PredCoffee: A binary classification approach specifically for coffee odor. iScience 2024; 27:110041. [PMID: 38868178 PMCID: PMC11167484 DOI: 10.1016/j.isci.2024.110041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 04/26/2024] [Accepted: 05/16/2024] [Indexed: 06/14/2024] Open
Abstract
Compared with traditional methods, using machine learning to assess or predict the odor of molecules can reduce costs in several respects. Our research aims to collect molecules with coffee odor and summarize their regularities, ultimately creating a binary classifier that can determine whether a molecule has a coffee odor. In this study, a total of 371 coffee-odor molecules and 9,700 non-coffee-odor molecules were collected. The Knowledge-guided Pre-training of Graph Transformer (KPGT), support vector machine (SVM), random forest (RF), multi-layer perceptron (MLP), and message-passing neural network (MPNN) models were trained on the data. The model with the best performance was selected as the basis of the predictor. The prediction accuracy of the KPGT model exceeded 0.84, and the predictor has been deployed as the web server PredCoffee.
Affiliation(s)
- Yi He
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun 130012, China
- Ruirui Huang
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun 130012, China
- Ruoyu Zhang
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun 130012, China
- Fei He
  - Department of Electrical Engineering and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Lu Han
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun 130012, China
- Weiwei Han
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun 130012, China
2
Sufian MA, Hamzi W, Zaman S, Alsadder L, Hamzi B, Varadarajan J, Azad MAK. Enhancing Clinical Validation for Early Cardiovascular Disease Prediction through Simulation, AI, and Web Technology. Diagnostics (Basel) 2024; 14:1308. [PMID: 38928723 PMCID: PMC11202579 DOI: 10.3390/diagnostics14121308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Revised: 05/10/2024] [Accepted: 05/21/2024] [Indexed: 06/28/2024] Open
Abstract
Cardiovascular diseases (CVDs) remain a major global health challenge and a leading cause of mortality, highlighting the need for improved predictive models. We introduce an innovative agent-based dynamic simulation technique that enhances our AI models' capacity to predict CVD progression. This method simulates individual patient responses to various cardiovascular risk factors, improving prediction accuracy and detail. In addition, by combining an ensemble learning model with a web application interface, we developed an AI dashboard-based model to enhance the accuracy of disease prediction and provide a user-friendly app. The performance of traditional algorithms was notable, with ensemble learning and XGBoost achieving accuracies of 91% and 95%, respectively. A significant aspect of our research was the integration of these models into a Streamlit-based interface, enhancing user accessibility and experience. The Streamlit application achieved a predictive accuracy of 97%, demonstrating the efficacy of combining advanced AI techniques with user-centered web applications in medical prediction scenarios. This 97% figure was assessed using the Brier score and a calibration curve. The design of the Streamlit application facilitates seamless interaction between complex ML models and end-users, including clinicians and patients, supporting its use in real-time clinical settings. While the study offers new insights into AI-driven CVD prediction, we acknowledge limitations such as the dataset size. We validated our proposed predictive methodology in an external clinical setting, demonstrating its robustness and accuracy under real-world conditions. The validation process confirmed the model's efficacy in the early detection of CVDs, reinforcing its potential for integration into clinical workflows to aid in proactive patient care and management.
Future research directions include expanding the dataset, exploring additional algorithms, and conducting clinical trials to validate our findings. This research provides a valuable foundation for future studies, aiming to make significant strides against CVDs.
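The abstract names the Brier score and a calibration curve as its checks on predicted probabilities but does not define them. The following is a minimal sketch of both, not the authors' implementation: the function names, the bin count, and the sample outcomes and risks are all hypothetical and chosen only for illustration.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_bins(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed event rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    rows = []
    for members in bins:
        if members:
            mean_p = sum(p for _, p in members) / len(members)
            rate = sum(y for y, _ in members) / len(members)
            rows.append((mean_p, rate, len(members)))
    return rows


# Hypothetical outcomes (1 = CVD) and predicted risks, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]

print(round(brier_score(y_true, y_prob), 4))
for mean_p, rate, count in calibration_bins(y_true, y_prob):
    print(f"predicted={mean_p:.2f} observed={rate:.2f} n={count}")
```

A lower Brier score is better, and a well-calibrated model shows observed event rates close to the mean predicted probability in each bin.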
Affiliation(s)
- Md Abu Sufian
  - IVR Low-Carbon Research Institute, Chang’an University, Xi’an 710018, China
  - School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
- Wahiba Hamzi
  - Laboratoire de Biotechnologie Santé et Environnement, Department of Biology, University of Blida, Blida 09000, Algeria
- Sadia Zaman
  - Department of Physiology, Queen Mary University, London E1 4NS, UK
- Lujain Alsadder
  - Department of Physiology, Queen Mary University, London E1 4NS, UK
- Boumediene Hamzi
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Caltech, CA 91125, USA
  - The Alan Turing Institute, London NW1 2DB, UK
  - Department of Mathematics, Gulf University for Science and Technology (GUST), Mubarak Al-Abdullah 32093, Kuwait
- Jayasree Varadarajan
  - Centre for Digital Innovation, Manchester Metropolitan University, Manchester M15 6BH, UK
- Md Abul Kalam Azad
  - Department of Medicine, Rangpur Medical College and Hospital, Rangpur 5400, Bangladesh
3
Zhao R, Wang G, Li F, Wang J, Zhang Y, Li D, Liu S, Li J, Song J, Wei F, Wang C. Developing Machine Learning-Based Predictive Models for Hallux Valgus Recurrence Based on Measurements From Radiographs. Foot Ankle Int 2024:10711007241256648. [PMID: 38872342 DOI: 10.1177/10711007241256648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
BACKGROUND Machine learning (ML) is increasingly used to predict the prognosis of numerous diseases. This retrospective analysis aimed to develop a prediction model using ML algorithms and to identify predictors associated with the recurrence of hallux valgus (HV) following surgery. METHODS A total of 198 symptomatic feet that underwent chevron osteotomy combined with a distal soft tissue procedure were enrolled and analyzed from 2 independent medical centers. The feet were grouped according to nonrecurrence or recurrence based on 1-year follow-up outcomes. Preoperative weightbearing radiographs and immediate postoperative nonweightbearing radiographs were obtained for each HV foot. Radiographic measurements (eg, HV angle and intermetatarsal angle) were acquired and used for ML model training. A total of 9 commonly used ML models were trained on the data obtained from one institute (108 feet), and tested on the other data set from another independent institute (90 feet) for external validation. Optimal feature sets for each model were identified based on a 2000-resample bootstrap-based internal validation via an exhaustive search. The performance of each model was then tested on the external validation set. The area under the curve (AUC), classification accuracy, sensitivity, and specificity of each model were calculated to evaluate the performance of each model. RESULTS The support vector machine (SVM) model showed the highest predictive accuracy compared to other methods, with an AUC of 0.88 and an accuracy of 75.6%. Preoperative hallux valgus angle, tibial sesamoid position, postoperative intermetatarsal angle, and postoperative tibial sesamoid position were identified as the most selected features by several ML models. CONCLUSION ML classifiers such as SVM could predict the recurrence of HV (an HVA >20 degrees) at a 1-year follow-up while identifying associated predictors in a multivariate manner. 
This study holds the potential for foot and ankle surgeons to effectively identify individuals at higher risk of HV recurrence postsurgery.
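The models above are ranked by AUC on an external validation set. As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of positive-negative pairs in which the positive case receives the higher score. The labels and scores are hypothetical, and this pairwise formulation is one standard way to compute AUC, not necessarily the one the authors used.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical external-validation labels (1 = recurrence) and model scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

print(round(roc_auc(y_true, scores), 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of recurrent from nonrecurrent feet.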
Affiliation(s)
- Rui Zhao
  - Department of Orthopedic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Guobin Wang
  - Department of Orthopedic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Fengtan Li
  - Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China
- Jinchan Wang
  - Department of Dermatology, Tianjin Medical University General Hospital, Tianjin, China
- Yuan Zhang
  - Department of Orthopedic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Dong Li
  - Department of Orthopedic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Shen Liu
  - Department of Orthopedic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Jie Li
  - Graduate School, Tianjin Medical University, Tianjin, China
- Jiajun Song
  - Department of Orthopedic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Fangyuan Wei
  - Department of Hand and Foot Surgery, Beijing University of Chinese Medicine Third Affiliated Hospital, Beijing, China
  - Engineering Research Center of Chinese Orthopaedic and Sports Rehabilitation Artificial Intelligent, Ministry of Education, Beijing, China
- Chenguang Wang
  - Department of Orthopedic Surgery, Tianjin Medical University General Hospital, Tianjin, China
4
Zhong Y, Wu Q, Cai L, Chen Y, Shen Q. CDC167 exhibits potential as a biomarker for airway inflammation in asthma. Mamm Genome 2024; 35:135-148. [PMID: 38580753 PMCID: PMC11130062 DOI: 10.1007/s00335-024-10037-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 03/01/2024] [Indexed: 04/07/2024]
Abstract
Current asthma treatments have been found to decrease the risk of disease progression. Herein, we aimed to characterize novel potential therapeutic targets for asthma. Differentially expressed genes (DEGs) for the GSE64913 and GSE137268 datasets were characterized. Weighted correlation network analysis (WGCNA) was used to identify trait-related module genes within the GSE67472 dataset. The intersection of the module genes of interest and the DEGs comprised the key module genes, which underwent additional candidate gene screening using machine learning. In addition, a bioinformatics-based approach was used to analyze the relative expression levels, diagnostic values, and relevant enriched pathways of the screened candidate genes. Furthermore, the candidate genes were silenced in asthmatic mice, and inflammation and lung injury in the mice were assessed. A total of 1710 DEGs were characterized in GSE64913 and GSE137268 for asthma patients. WGCNA identified 2367 asthma module genes, of which 285 overlapped with the 1710 DEGs. Four candidate genes, CDC167, POSTN, SEC14L1, and SERPINB2, were identified from the intersection of the genes selected by three machine learning algorithms: Least Absolute Shrinkage and Selection Operator, Random Forest, and Support Vector Machine. All the candidate genes were significantly upregulated in asthma patients and demonstrated diagnostic utility for asthma. Furthermore, silencing CDC167 significantly reduced the levels of inflammatory cytokines and alleviated lung injury in ovalbumin (OVA)-induced asthmatic mice. Our study demonstrated that CDC167 exhibits potential as a diagnostic marker and therapeutic target for asthma patients.
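The screening step described above keeps only the genes that every selection method agrees on. A minimal sketch of that consensus step follows. The per-algorithm gene sets are hypothetical (the abstract reports only the four genes that survive the intersection), and `GENE_A`, `GENE_B`, and `GENE_C` are placeholder names.

```python
def consensus(*selections):
    """Return the items chosen by every selection method."""
    if not selections:
        raise ValueError("at least one selection is required")
    return set.intersection(*(set(s) for s in selections))


# Hypothetical per-algorithm selections; only the four shared genes come from the abstract.
lasso_genes = {"CDC167", "POSTN", "SEC14L1", "SERPINB2", "GENE_A"}
rf_genes = {"CDC167", "POSTN", "SEC14L1", "SERPINB2", "GENE_B", "GENE_C"}
svm_genes = {"CDC167", "POSTN", "SEC14L1", "SERPINB2", "GENE_A", "GENE_C"}

candidates = sorted(consensus(lasso_genes, rf_genes, svm_genes))
print(candidates)
```

The same helper would cover the earlier step of overlapping WGCNA module genes with the DEGs, since that is also a set intersection.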
Affiliation(s)
- Yukai Zhong
  - Department of Pediatrics, Kongjiang Hospital of Shanghai Yangpu District, Shanghai, 200093, China
- Qiong Wu
  - Department of Respiratory, Kongjiang Hospital of Shanghai Yangpu District, No. 480 Shuang Yang Road, Yangpu District, Shanghai, 200093, China
- Li Cai
  - Department of Colorectal Surgery, Kongjiang Hospital of Shanghai Yangpu District, Shanghai, 200093, China
- Yuanjing Chen
  - Department of Respiratory, Kongjiang Hospital of Shanghai Yangpu District, No. 480 Shuang Yang Road, Yangpu District, Shanghai, 200093, China
- Qi Shen
  - Department of Geriatric Medicine, Tongji University Affiliated Yangpu Hospital, No. 450 Teng Yue Road, Yangpu District, Shanghai, 200090, China
5
Al-Alshaikh HA, P P, Poonia RC, Saudagar AKJ, Yadav M, AlSagri HS, AlSanad AA. Comprehensive evaluation and performance analysis of machine learning in heart disease prediction. Sci Rep 2024; 14:7819. [PMID: 38570582 PMCID: PMC10991287 DOI: 10.1038/s41598-024-58489-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 03/29/2024] [Indexed: 04/05/2024] Open
Abstract
Heart disease is a leading cause of mortality on a global scale. Accurately predicting cardiovascular disease poses a significant challenge within clinical data analysis. The present study introduces a prediction model that utilizes various combinations of information and employs multiple established classification approaches. The proposed technique combines the genetic algorithm (GA) and the recursive feature elimination method (RFEM) to select relevant features, thus enhancing the model's robustness. Techniques like the undersampling clustering oversampling method (USCOM) address the issue of data imbalance, thereby improving the model's predictive capabilities. The classification task employs a multilayer deep convolutional neural network (MLDCNN), trained using the adaptive elephant herd optimization method (AEHOM). The proposed machine learning-based heart disease prediction method (ML-HDPM) demonstrates outstanding performance across various crucial evaluation parameters, as indicated by its comprehensive assessment. During the training process, the ML-HDPM model exhibits a high level of performance, achieving an accuracy of 95.5% and a precision of 94.8%. Its sensitivity (recall) reaches 96.2%, while its F-score of 91.5% indicates well-balanced performance. The specificity of ML-HDPM is recorded at 89.7%. The findings underscore the potential of ML-HDPM to transform the prediction of heart disease and aid healthcare practitioners in providing precise diagnoses, exerting a substantial influence on patient care outcomes.
Affiliation(s)
- Halah A Al-Alshaikh
  - Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), 11432, Riyadh, Saudi Arabia
- Prabu P
  - Department of Computer Science, CHRIST University, Bangalore, 560029, India
- Abdul Khader Jilani Saudagar
  - Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), 11432, Riyadh, Saudi Arabia
- Manoj Yadav
  - Department of Computer Science and Engineering, Guru Jambheshwar University of Science and Technology, Hisar, India
- Hatoon S AlSagri
  - Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), 11432, Riyadh, Saudi Arabia
- Abeer A AlSanad
  - Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), 11432, Riyadh, Saudi Arabia
6
Fathima AJ, Fasla MMN. A comprehensive review on heart disease prognostication using different artificial intelligence algorithms. Comput Methods Biomech Biomed Engin 2024:1-18. [PMID: 38424704 DOI: 10.1080/10255842.2024.2319706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 02/12/2024] [Indexed: 03/02/2024]
Abstract
Timely prediction of heart disease is vital for preserving life. Many conventional methods have attempted early prediction but have faced high prediction costs, long computation times, and difficulties with large data volumes, all of which reduced prediction accuracy. To overcome these pitfalls, AI (Artificial Intelligence) technology has evolved to diagnose heart disease through the deployment of several ML (Machine Learning) and DL (Deep Learning) algorithms. It improves detection through its capacity to learn from massive data covering patients' age, obesity, hypertension, and other risk factors, and to extract the information needed to differentiate between cases. Moreover, the ability to store large datasets with AI greatly assists in analysing the occurrence of the disease from historical data. Hence, this paper reviews the different AI-based algorithms used in heart disease prognostication and conveys their benefits by examining various existing works. It performs a comparative analysis and critical assessment covering the accuracies and the most heavily used algorithms in prior studies in this area. The major findings emphasize the evolution and continued exploration of AI techniques for heart disease prediction; future researchers may aim to determine the dimensions that have attained high and low prediction accuracies, on which appropriate research can then be performed. Finally, future research directions are included to offer new stimulus for further investigation of AI in cardiac disease diagnosis.
Affiliation(s)
- A Jainul Fathima
  - Assistant Professor, IT Francis Xavier Engineering College, Tirunelveli - 627003, India
7
Atimbire SA, Appati JK, Owusu E. Empirical exploration of whale optimisation algorithm for heart disease prediction. Sci Rep 2024; 14:4530. [PMID: 38402276 PMCID: PMC10894250 DOI: 10.1038/s41598-024-54990-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 02/19/2024] [Indexed: 02/26/2024] Open
Abstract
Heart Diseases have the highest mortality worldwide, necessitating precise predictive models for early risk assessment. Much existing research has focused on improving model accuracy with single datasets, often neglecting the need for comprehensive evaluation metrics and utilization of different datasets in the same domain (heart disease). This research introduces a heart disease risk prediction approach by harnessing the whale optimization algorithm (WOA) for feature selection and implementing a comprehensive evaluation framework. The study leverages five distinct datasets, including the combined dataset comprising the Cleveland, Long Beach VA, Switzerland, and Hungarian heart disease datasets. The others are the Z-AlizadehSani, Framingham, South African, and Cleveland heart datasets. The WOA-guided feature selection identifies optimal features, subsequently integrated into ten classification models. Comprehensive model evaluation reveals significant improvements across critical performance metrics, including accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic curve. These enhancements consistently outperform state-of-the-art methods using the same dataset, validating the effectiveness of our methodology. The comprehensive evaluation framework provides a robust assessment of the model's adaptability, underscoring the WOA's effectiveness in identifying optimal features in multiple datasets in the same domain.
Affiliation(s)
- Ebenezer Owusu
  - Department of Computer Science, University of Ghana, Accra, Ghana
8
Xu L, Guo C, Liu M. A weighted distance-based dynamic ensemble regression framework for gastric cancer survival time prediction. Artif Intell Med 2024; 147:102740. [PMID: 38184344 DOI: 10.1016/j.artmed.2023.102740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 10/28/2023] [Accepted: 11/28/2023] [Indexed: 01/08/2024]
Abstract
Accurate prediction of gastric cancer patient survival time is essential for clinical decision-making. However, unified static models lack specificity and flexibility in predictions owing to the varying survival outcomes among gastric cancer patients. We address these problems by using an ensemble learning approach and adaptively assigning greater weights to similar patients to make more targeted predictions when predicting an individual's survival time. We treat these problems as regression problems and introduce a weighted dynamic ensemble regression framework. To better identify similar patients, we devise a method to measure patient similarity, considering the diverse impacts of features. Subsequently, we use this measure to design both a weighted K-means clustering method and a fuzzy K-means sampling technique to group patients and train corresponding base regressors. To achieve more targeted predictions, we calculate the weight of each base regressor based on the similarity between the patient to be predicted and the patient clusters, culminating in the integration of the results. The model is validated on a dataset of 7791 patients, outperforming other models in terms of three evaluation metrics, namely, the root mean square error, mean absolute error, and the coefficient of determination. The weighted dynamic ensemble regression strategy can improve the baseline model by 1.75%, 2.12%, and 13.45% in terms of the three respective metrics while also mitigating the imbalanced survival time distribution issue. This enhanced performance has been statistically validated, even when tested on six public datasets with different sizes. By considering feature variations, patients with distinct survival profiles can be effectively differentiated, and the model predictive performance can be enhanced. The results generated by our proposed model can be invaluable in guiding decisions related to treatment plans and resource allocation. 
Furthermore, the model has the potential for broader applications in prognosis for other types of cancers or similar regression problems in various domains.
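The core idea in this abstract is that each base regressor's vote is weighted by how similar the new patient is to the cluster that regressor was trained on. The sketch below illustrates that idea with inverse weighted-distance weights. It is an assumption-laden simplification, not the authors' framework: the distance formula, the `eps` smoothing term, the constant stand-in regressors, and every number are hypothetical.

```python
import math


def weighted_distance(x, centroid, feature_weights):
    """Euclidean distance in which each feature contributes according to its weight."""
    return math.sqrt(
        sum(w * (a - b) ** 2 for w, a, b in zip(feature_weights, x, centroid))
    )


def ensemble_predict(x, centroids, regressors, feature_weights, eps=1e-9):
    """Combine base regressors, giving more weight to those trained on closer clusters."""
    dists = [weighted_distance(x, c, feature_weights) for c in centroids]
    raw = [1.0 / (d + eps) for d in dists]  # closer cluster -> larger raw weight
    total = sum(raw)
    weights = [r / total for r in raw]  # normalise so the weights sum to 1
    prediction = sum(w * reg(x) for w, reg in zip(weights, regressors))
    return prediction, weights


# Hypothetical setup: two patient clusters, each with a constant stand-in regressor
# that returns a survival time in months.
centroids = [(0.0, 0.0), (4.0, 0.0)]
regressors = [lambda x: 10.0, lambda x: 30.0]
feature_weights = (1.0, 1.0)

prediction, weights = ensemble_predict((1.0, 0.0), centroids, regressors, feature_weights)
print(round(prediction, 2), [round(w, 2) for w in weights])
```

In this toy case the patient is three times closer to the first cluster, so the first regressor receives about three quarters of the total weight.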
Affiliation(s)
- Liangchen Xu
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
- Chonghui Guo
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
- Mucan Liu
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
9
Geng C, Wang Z, Tang Y. Machine learning in Alzheimer's disease drug discovery and target identification. Ageing Res Rev 2024; 93:102172. [PMID: 38104638 DOI: 10.1016/j.arr.2023.102172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/28/2023] [Accepted: 12/13/2023] [Indexed: 12/19/2023]
Abstract
Alzheimer's disease (AD) stands as a formidable neurodegenerative ailment that poses a substantial threat to the elderly population, with no known curative or disease-slowing drugs in existence. Among the vital and time-consuming stages in the drug discovery process, disease modeling and target identification hold particular significance. Disease modeling allows for a deeper comprehension of disease progression mechanisms and potential therapeutic avenues. On the other hand, target identification serves as the foundational step in drug development, exerting a profound influence on all subsequent phases and ultimately determining the success rate of drug development endeavors. Machine learning (ML) techniques have ushered in transformative breakthroughs in the realm of target discovery. Leveraging the strengths of large dataset analysis, multifaceted data processing, and the exploration of intricate biological mechanisms, ML has become instrumental in the quest for effective AD treatments. In this comprehensive review, we offer an account of how ML methodologies are being deployed in the pursuit of drug discovery for AD. Furthermore, we provide an overview of the utilization of ML in uncovering potential intervention strategies and prospective therapeutic targets for AD. Finally, we discuss the principal challenges and limitations currently faced by these approaches. We also explore the avenues for future research that hold promise in addressing these challenges.
Affiliation(s)
- Chaofan Geng
  - Department of Neurology & Innovation Center for Neurological Disorders, Xuanwu Hospital, Capital Medical University, National Center for Neurological Disorders, Beijing, China
- ZhiBin Wang
  - Department of Neurology & Innovation Center for Neurological Disorders, Xuanwu Hospital, Capital Medical University, National Center for Neurological Disorders, Beijing, China
- Yi Tang
  - Department of Neurology & Innovation Center for Neurological Disorders, Xuanwu Hospital, Capital Medical University, National Center for Neurological Disorders, Beijing, China
  - Neurodegenerative Laboratory of Ministry of Education of the People's Republic of China, Beijing, China
10
Chen PS, Lai CH, Chen YT, Lung TY. Developing a prototype system of computer-aided appointment scheduling: A radiology department case study. Technol Health Care 2024; 32:997-1013. [PMID: 37545282 DOI: 10.3233/thc-230374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
BACKGROUND Scheduling patient appointments in hospitals is complicated due to various types of patient examinations, different departments and physicians accessed, and different body parts affected. OBJECTIVE This study focuses on the radiology scheduling problem, which involves multiple radiological technologists in multiple examination rooms, and then proposes a prototype system of computer-aided appointment scheduling based on information such as the examining radiological technologists, examination departments, the patient's body parts being examined, the patient's gender, and the patient's age. METHODS The system incorporated a stepwise multiple regression analysis (SMRA) model to predict the number of examination images and then used the K-Means clustering with a decision tree classification model to classify the patient's examination time within an appropriate time interval. RESULTS The constructed prototype creates a feasible patient appointment schedule by classifying patient examination times into different categories for different patients according to the four types of body parts, eight hospital departments, and 10 radiological technologists. CONCLUSION The proposed patient appointment scheduling system can schedule appointment times for different types of patients according to the type of visit, thereby addressing the challenges associated with diversity and uncertainty in radiological examination services. It can also improve the quality of medical treatment.
Affiliation(s)
- Ping-Shun Chen
  - Department of Industrial and Systems Engineering, Chung Yuan Christian University, Taoyuan, Taiwan
- Chin-Hui Lai
  - Department of Information Management, Chung Yuan Christian University, Taoyuan, Taiwan
- Ying-Tzu Chen
  - Department of Industrial and Systems Engineering, Chung Yuan Christian University, Taoyuan, Taiwan
- Ting-Yu Lung
  - Department of Industrial and Systems Engineering, Chung Yuan Christian University, Taoyuan, Taiwan
11
Wang K, Tang Y, Zhang F, Guo X, Gao L. Combined application of inflammation-related biomarkers to predict postoperative complications of rectal cancer patients: a retrospective study by machine learning analysis. Langenbecks Arch Surg 2023; 408:400. [PMID: 37831218 DOI: 10.1007/s00423-023-03127-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 09/29/2023] [Indexed: 10/14/2023]
Abstract
BACKGROUND Postoperative complications in patients with rectal cancer pose challenges to postoperative recovery. Accurately predicting these complications is crucial for developing effective treatment plans for patients. METHODS In this retrospective study, 493 patients with rectal cancer who underwent radical resection between January 2020 and December 2021 were examined. We used logistic regression, support vector machines, regression trees, and random forests to predict the incidence of postoperative complications and evaluated the performance of each model. The results were analyzed to make recommendations for reducing complications. RESULTS Among the four machine learning models, random forest performed best, with an AUC of 0.880 (95% CI 0.807-0.949), an accuracy of 88.0% (95% CI 0.815-0.929), a sensitivity of 96.6%, and a specificity of 45.8%. Notably, factors such as the inflammation-related prognostic index, prognostic nutritional index, tumor location, and T stage were found to significantly increase the probability of postoperative complications. CONCLUSION Our study provides evidence that machine learning models can effectively evaluate early postoperative complications in patients after surgery.
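The results above pair a high sensitivity with a much lower specificity, which is easier to interpret once the three metrics are written out in terms of the confusion matrix. The following minimal sketch does that. The labels and predictions are hypothetical and do not reproduce the study's figures.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives) and specificity (recall on negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

When one class dominates the data, accuracy mostly reflects how well that majority class is detected, so it can stay high even when performance on the minority class is poor.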
Affiliation(s)
- Kunyue Wang
  - Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, 215006, Jiangsu Province, China
- Youyuan Tang
  - Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, 215006, Jiangsu Province, China
- Feng Zhang
  - Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, 215006, Jiangsu Province, China
- Xingpo Guo
  - Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, 215006, Jiangsu Province, China
- Ling Gao
  - Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, 215006, Jiangsu Province, China
12
Mahajan P, Uddin S, Hajati F, Moni MA. Ensemble Learning for Disease Prediction: A Review. Healthcare (Basel) 2023; 11:1808. [PMID: 37372925 DOI: 10.3390/healthcare11121808] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/06/2023] [Revised: 06/19/2023] [Accepted: 06/19/2023] [Indexed: 06/29/2023] Open
Abstract
Machine learning models are used to create and enhance various disease prediction frameworks. Ensemble learning is a machine learning technique that combines multiple classifiers to improve performance by making more accurate predictions than a single classifier. Although numerous studies have employed ensemble approaches for disease prediction, there is a lack of thorough assessment of commonly used ensemble approaches against highly researched diseases. Consequently, this study aims to identify significant trends in the performance accuracies of ensemble techniques (i.e., bagging, boosting, stacking, and voting) against five heavily researched diseases (i.e., diabetes, skin disease, kidney disease, liver disease, and heart conditions). Using a well-defined search strategy, we first identified 45 articles from the current literature that applied two or more of the four ensemble approaches to any of these five diseases and were published in 2016-2023. Although stacking was used the fewest times (23) compared with bagging (41) and boosting (37), it showed the most accurate performance the most times (19 out of 23). The voting approach is the second-best ensemble approach, as revealed in this review. Stacking always showed the most accurate performance in the reviewed articles for skin disease and diabetes. Bagging demonstrated the best performance for kidney disease (five out of six times) and boosting for liver disease and diabetes (four out of six times). The results show that stacking has demonstrated greater accuracy in disease prediction than the other three candidate algorithms. Our study also demonstrates variability in the perceived performance of different ensemble approaches against frequently used disease datasets.
The findings of this work will assist researchers in better understanding current trends and hotspots in disease prediction models that employ ensemble learning, as well as in determining a more suitable ensemble model for predictive disease analytics. This article also discusses variability in the perceived performance of different ensemble approaches against frequently used disease datasets.
Affiliation(s)
- Palak Mahajan
- College of Engineering and Science, Victoria University, Sydney, NSW 2000, Australia
| | - Shahadat Uddin
- School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW 2037, Australia
| | - Farshid Hajati
- College of Engineering and Science, Victoria University, Sydney, NSW 2000, Australia
| | - Mohammad Ali Moni
- School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St. Lucia, QLD 4072, Australia
| |
|
13
|
Liu ZW, Chen G, Dong CF, Qiu WR, Zhang SH. Intelligent assistant diagnosis for pediatric inguinal hernia based on a multilayer and unbalanced classification model. Front Physiol 2023; 14:1105891. [PMID: 36998990 PMCID: PMC10043203 DOI: 10.3389/fphys.2023.1105891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Accepted: 02/27/2023] [Indexed: 03/17/2023] Open
Abstract
As one of the most common diseases in pediatric surgery, an inguinal hernia is usually diagnosed by medical experts based on clinical data collected from magnetic resonance imaging (MRI), computed tomography (CT), or B-ultrasound. The parameters of blood routine examination, such as white blood cell count and platelet count, are often used as diagnostic indicators of intestinal necrosis. Based on numerical medical data on blood routine examination parameters and liver and kidney function parameters, this paper used a machine learning algorithm to assist the preoperative diagnosis of intestinal necrosis in children with inguinal hernia. In this work, we used clinical data from 3,807 children with inguinal hernia symptoms and 170 children with intestinal necrosis and perforation caused by the disease. Three different models were constructed according to the blood routine examination and liver and kidney function. Some missing values were replaced using the RIN-3M (median, mean, or mode region random interpolation) method as necessary, and ensemble learning based on the voting principle was used to deal with the imbalanced datasets. The model trained after feature selection yielded satisfactory results, with an accuracy of 86.43%, sensitivity of 84.34%, specificity of 96.89%, and AUC value of 0.91. Therefore, the proposed methods may be a potential approach for the auxiliary diagnosis of inguinal hernia in children.
Affiliation(s)
- Zhi-Wen Liu
- Department of General Surgery, Jiangxi Provincial Children’s Hospital, Nanchang, China
| | - Gang Chen
- Computer Department, Jing-De-Zhen Ceramic Institute, Jingdezhen, China
| | - Chao-Fan Dong
- Department of General Surgery, Jingdezhen No. 1 People’s Hospital, Jingdezhen, China
| | - Wang-Ren Qiu
- Computer Department, Jing-De-Zhen Ceramic Institute, Jingdezhen, China
- *Correspondence: Wang-Ren Qiu; Shou-Hua Zhang
| | - Shou-Hua Zhang
- Department of General Surgery, Jiangxi Provincial Children’s Hospital, Nanchang, China
- *Correspondence: Wang-Ren Qiu; Shou-Hua Zhang
| |
|
14
|
Wu X, Wang J. Application of Bagging, Boosting and Stacking Ensemble and EasyEnsemble Methods for Landslide Susceptibility Mapping in the Three Gorges Reservoir Area of China. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2023; 20:4977. [PMID: 36981886 PMCID: PMC10049250 DOI: 10.3390/ijerph20064977] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/05/2022] [Revised: 03/06/2023] [Accepted: 03/07/2023] [Indexed: 06/18/2023]
Abstract
Since the impoundment of the Three Gorges Reservoir area in 2003, the potential risks of geological disasters in the reservoir area have increased significantly, among which the hidden dangers of landslides are particularly prominent. To reduce casualties and damage, efficient and precise landslide susceptibility evaluation methods are important. Multiple ensemble models have been used to evaluate the susceptibility of the upper part of Badong County to landslides. In this study, EasyEnsemble technology was used to solve the imbalance between landslide and nonlandslide sample data. The extracted evaluation factors were input into three ensemble models (bagging, boosting, and stacking) for training, and landslide susceptibility maps (LSM) were drawn. According to the importance analysis, the important factors affecting the occurrence of landslides are altitude, terrain surface texture (TST), distance to residences, distance to rivers and land use. The influences of different grid sizes on the susceptibility results were compared, and a larger grid was found to lead to overfitting of the prediction results. Therefore, a 30 m grid was selected as the evaluation unit. The accuracy, area under the curve (AUC), recall rate, test set precision, and kappa coefficient of a multi-grained cascade forest (gcForest) model with the stacking method were 0.958, 0.991, 0.965, 0.946, and 0.91, respectively, which are significantly better than the values produced by the other models.
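The abstract does not spell out how EasyEnsemble balances the landslide and nonlandslide samples. As a rough sketch of the general idea only (not the authors' implementation), the majority class is repeatedly undersampled to the size of the minority class, giving several balanced subsets on which separate learners can be trained. The function name, the toy cell labels, and the subset count below are illustrative assumptions.

```python
import random


def balanced_subsets(minority, majority, n_subsets, seed=0):
    """Return n_subsets balanced training sets.

    Each subset keeps every minority sample and adds an equally sized
    random draw (without replacement) from the majority class.
    """
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority, k=len(minority))
        subsets.append(list(minority) + drawn)
    return subsets


# Hypothetical toy data: 3 landslide cells versus 12 nonlandslide cells.
landslide = [("cell_%d" % i, 1) for i in range(3)]
non_landslide = [("cell_%d" % i, 0) for i in range(3, 15)]

subsets = balanced_subsets(landslide, non_landslide, n_subsets=4)
for subset in subsets:
    positives = sum(label for _, label in subset)
    print(len(subset), positives)  # every subset is half positive, half negative
```

In a full pipeline, one base learner would be fitted per subset and their outputs combined; that step is omitted here because the abstract gives no detail on it.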
|
15
|
Han W, Kang X, He W, Jiang L, Li H, Xu B. A new method for disease diagnosis based on hierarchical BRB with power set. Heliyon 2023; 9:e13619. [PMID: 36852081 PMCID: PMC9957705 DOI: 10.1016/j.heliyon.2023.e13619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Revised: 01/31/2023] [Accepted: 02/06/2023] [Indexed: 02/13/2023] Open
Abstract
Disease diagnosis occupies an important position in the medical field. The diagnosis of the disease is the basis for choosing the right treatment plan. Doctors must first diagnose what the patient has based on the clinical characteristics of various diseases, and then they can administer the right medicine. Models built for disease diagnosis must be able to handle various kinds of uncertain information. The belief rule base (BRB) can effectively handle information under uncertainty by introducing belief distributions. However, in current research, BRB-based disease diagnosis models still suffer from combinatorial rule explosion and an inability to deal with local ignorance effectively. Therefore, a hierarchical BRB with power set (H-BRBp)-based disease diagnosis model is proposed in this paper. First, the physiological indexes and data of the patients were analyzed, and the data were preprocessed using the principal component regression (PCR) algorithm. Second, the H-BRBp disease diagnosis model was constructed to address the deficiencies of the above BRB disease diagnosis model. Finally, the validity and advantages of the model were verified by experiments on lumbar spine disease diagnosis and a large number of comparison experiments.
Affiliation(s)
- Wence Han
- Harbin Normal University, Harbin 150025, China
| | - Xiao Kang
- Harbin Normal University, Harbin 150025, China
| | - Wei He
- Harbin Normal University, Harbin 150025, China; Rocket Force University of Engineering, Xi'an 710025, China
| | - Li Jiang
- Harbin Medical University Cancer Hospital, China
| | - Hongyu Li
- Harbin Normal University, Harbin 150025, China
| | - Bing Xu
- Harbin Normal University, Harbin 150025, China
| |
|
16
|
Liu Y, Xu Y, Yang X, Miao G, Wu Y, Yang S. The prevalence of anxiety and its key influencing factors among the elderly in China. Front Psychiatry 2023; 14:1038049. [PMID: 36816413 PMCID: PMC9932967 DOI: 10.3389/fpsyt.2023.1038049] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/06/2022] [Accepted: 01/17/2023] [Indexed: 02/05/2023] Open
Abstract
INTRODUCTION With the rapidly aging population, the mental health of older adults is receiving increasing attention. Anxiety is a common mental health illness in older adults. Therefore, the study aimed to explore the current situation of anxiety and its influencing factors among the elderly in China. METHODS Based on data from the 2018 Chinese Longitudinal Healthy Longevity Survey (CLHLS), a total of 10,982 respondents aged 60 and above were selected. The Generalized Anxiety Disorder (GAD-7) scale was used to assess anxiety. Univariate and multivariate analyses were used to analyze the influencing factors of anxiety. A random forest model was built to rank the importance of each influencing factor. RESULTS The results showed that the prevalence of anxiety among the elderly was 11.24%. Anxiety was mainly associated with 14 factors from five aspects: sociodemographic characteristics, health status, psychological state, social trust and social participation, among which loneliness, related to psychological state, was the most important factor. DISCUSSION This study shows that the present situation of anxiety among the elderly cannot be ignored, and that measures are needed to prevent and control it from many aspects.
Affiliation(s)
- Yixuan Liu
- Department of Social Medicine and Health Management, School of Public Health, Jilin University, Changchun, China
| | - Yanling Xu
- Department of Social Medicine and Health Management, School of Public Health, Jilin University, Changchun, China
| | - Xinyan Yang
- Department of Social Medicine and Health Management, School of Public Health, Jilin University, Changchun, China
| | - Guomei Miao
- Department of Social Medicine and Health Management, School of Public Health, Jilin University, Changchun, China
| | - Yinghui Wu
- Department of Social Medicine and Health Management, School of Public Health, Jilin University, Changchun, China
| | - Shujuan Yang
- Department of Social Medicine and Health Management, School of Public Health, Jilin University, Changchun, China
| |
|
17
|
Analytical Comparison of Risk Prediction Models for the Onset of Macrosomia Based on Three Statistical Methods. DISEASE MARKERS 2022; 2022:9073043. [PMID: 36124028 PMCID: PMC9482546 DOI: 10.1155/2022/9073043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Revised: 08/29/2022] [Accepted: 09/01/2022] [Indexed: 11/18/2022]
Abstract
Background and Purpose. Fetal overgrowth can pose a serious threat to the safety of mother and child. Early identification of high-risk pregnant women and timely pregnancy intervention and guidance are of great value in preventing macrosomia and improving adverse maternal and infant outcomes. The current clinical methods for predicting macrosomia mainly rely on obstetric examination and imaging, but their accuracy is controversial, and there is no accepted method for accurately predicting macrosomia. We investigated the risk factors influencing the occurrence of macrosomia and established a prediction model for the occurrence of macrosomia to provide a reference basis for interventions to prevent macrosomia. Method. We retrospectively selected 93 women who were hospitalized in our hospital from March 2019 to May 2022 with a singleton pregnancy and delivered a macrosomic infant at term as the study group, and 356 women who delivered a normal-size baby during the same period as the control group. The variables associated with the onset of macrosomia were screened from maternal medical records. Logistic regression, random forest, and CART decision tree models were developed using the screened variables as input variables and the occurrence of macrosomia as the outcome variable. The performance of the three models was evaluated by accuracy, precision, recall, F1 score, and the receiver operating characteristic (ROC) curve. Result. The risk prediction models for the onset of macrosomia (logistic regression, random forest, and decision tree) were successfully developed, with accuracies of 0.904, 1.000, and 0.901 in the training set and 0.926, 0.582, and 0.852 in the validation set, respectively. The AUCs in the training set were 0.898, 1.000, and 0.789, and those in the validation set were 0.906, 0.913, and 0.731, respectively.
In general, the logistic regression model has the highest diagnostic efficiency, followed by the random forest model. Conclusion. Logistic regression models have high application value in predicting the risk of macrosomia, and it is suggested that the advantages of logistic regression and random forest models be combined in future studies and applications to improve the prediction of macrosomia risk.
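The threshold-based metrics named in the abstract (accuracy, precision, recall, F1 score) all follow from a confusion matrix. A minimal sketch of that arithmetic is given below; the counts are invented for illustration and are not taken from the study, and the function name is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Guard against empty denominators when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (not from the paper): 20 TP, 5 FP, 10 FN, 65 TN.
metrics = classification_metrics(tp=20, fp=5, fn=10, tn=65)
print(metrics)
```

With imbalanced groups such as 93 cases versus 356 controls, accuracy alone can look high while recall on the minority class stays low, which is one reason to report all four values alongside the AUC.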
|
18
|
Zhao X, Sui H, Yan C, Zhang M, Song H, Liu X, Yang J. Machine-Based Learning Shifting to Prediction Model of Deteriorative MCI Due to Alzheimer's Disease - A Two-Year Follow-Up Investigation. Curr Alzheimer Res 2022; 19:708-715. [PMID: 36278469 DOI: 10.2174/1567205020666221019122049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 08/26/2022] [Accepted: 08/26/2022] [Indexed: 01/27/2023]
Abstract
OBJECTIVE The aim of the present work was to investigate the features of the elderly population aged ≥65 yrs with deteriorative mild cognitive impairment (MCI) due to Alzheimer's disease (AD) in order to establish a prediction model. METHODS A total of 105 patients aged ≥65 yrs with MCI were followed up, and 357 features were collected, derived from demographic characteristics, hematological indicators (serum Aβ1-40, Aβ1-42, P-tau and MCP-1 levels, APOE gene), and multimodal brain magnetic resonance imaging (MRI) indicators of 116 brain regions (ADC, FA and CBF values). Cognitive function was followed up for 2 yrs. Using the Anaconda Python platform, the 105 patients were randomly divided into a training set (70%) and a test set (30%), all features were analyzed with a random forest algorithm, and a prediction model was established for rapidly deteriorating MCI. RESULTS Of the 105 patients enrolled, 41 deteriorated within 2 yrs and 64 did not. Model 1 was established based on demographic characteristics, hematological indicators and multimodal MRI image features, the accuracy of the training set being 100%, the accuracy of the test set 64%, sensitivity 50%, specificity 67%, and AUC 0.72. Model 2 was based on the first five features (APOE4 gene, FA value of left fusiform gyrus, FA value of left inferior temporal gyrus, FA value of left parahippocampal gyrus, ADC value of right calcarine fissure and surrounding cortex), the accuracy of the training set being 100%, the accuracy of the test set 85%, sensitivity 91%, specificity 80% and AUC 0.96. Model 3 was based on the first four features of Model 1, the accuracy of the training set being 100%, the accuracy of the test set 97%, sensitivity 100%, specificity 95% and AUC 0.99. Model 4 was based on the first three features of Model 1, the accuracy of the training set being 100%, the accuracy of the test set 94%, sensitivity 92%, specificity 94% and AUC 0.96.
Model 5 was based on the hematological characteristics, the accuracy of the training set being 100%, the accuracy of the test set 91%, sensitivity 100%, specificity 88% and AUC 0.97. The models based on the demographic characteristics and on the imaging characteristics (FA, CBF and ADC values) had lower sensitivity and specificity. CONCLUSION Model 3, which has four important predictive features, can predict rapidly deteriorating MCI due to AD in the community.
Affiliation(s)
- Xiaohui Zhao
- Department of Neurology, Shanghai Pudong New Area People's Hospital, Shanghai, People's Republic of China
| | - Haijing Sui
- Department of Radiology, Shanghai Pudong New Area People's Hospital, Shanghai, People's Republic of China
| | - Chengong Yan
- Department of Social Work, Shanghai Pudong New Area People's Hospital, Shanghai, People's Republic of China
| | - Min Zhang
- Department of Deep Learning and Artificial Intelligence, hcit.ai Co., Shanghai, People's Republic of China
| | - Haihan Song
- The Central Lab, Pudong New Area People's Hospital, Shanghai, China
| | - Xueyuan Liu
- Department of Social Work, Shanghai Pudong New Area People's Hospital, Shanghai, People's Republic of China
| | - Juan Yang
- Department of Neurology, Shanghai Pudong New Area People's Hospital, Shanghai, People's Republic of China
| |
|
19
|
Huang F, Zhang S, Li X, Huang Y, He S, Luo L. STAT3-mediated ferroptosis is involved in ulcerative colitis. Free Radic Biol Med 2022; 188:375-385. [PMID: 35779691 DOI: 10.1016/j.freeradbiomed.2022.06.242] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 05/20/2022] [Revised: 06/16/2022] [Accepted: 06/27/2022] [Indexed: 12/13/2022]
Abstract
Ferroptosis is a form of iron-dependent lipid peroxidation cell death that plays an important role in inflammation. However, the mechanism of ferroptosis in ulcerative colitis (UC) remains to be further investigated. In the present study, we merged the differentially expressed genes (DEGs) of UC in the GEO database with the ferroptosis-related genes of FerrDb for bioinformatics analysis and successfully screened out the ferroptosis-related hub gene STAT3 (signal transducer and activator of transcription 3). We then further validated the role of STAT3-mediated ferroptosis in in vitro and in vivo models of colitis. The results showed that ferroptosis was increased in DSS-induced colitis, Salmonella typhimurium (S. Tm) colitis and H2O2-induced IEC-6 cells. The phosphorylation level of the hub gene STAT3 was down-regulated in IEC-6 cells treated with H2O2, while Fer-1, a ferroptosis inhibitor, restored the phosphorylation level of STAT3. In addition, co-treatment of cells with H2O2 and a STAT3 inhibitor (stattic) showed an additive effect on the extent of ferroptosis. Taken together, these findings suggest that ferroptosis is closely associated with the development of colitis and that the ferroptosis-related gene STAT3 could serve as a potential biomarker for the diagnosis and treatment of ulcerative colitis.
Affiliation(s)
- Fangfang Huang
- Graduate School, Guangdong Medical University, Zhanjiang, Guangdong, 524023, China; Department of Pediatrics, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, Guangdong, 524023, China
| | - Suzhou Zhang
- The First Clinical College, Guangdong Medical University, Zhanjiang, Guangdong, 524023, China
| | - Xiaoling Li
- Experimental Animal Center, Guangdong Medical University, Zhanjiang, Guangdong, 524023, China
| | - Yuge Huang
- Department of Pediatrics, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, Guangdong, 524023, China.
| | - Shasha He
- Beijing Hospital of Traditional Chinese Medicine, Capital Medical University, Beijing Institute of Chinese Medicine, Beijing, 100000, China.
| | - Lianxiang Luo
- The Marine Biomedical Research Institute, Guangdong Medical University, Zhanjiang, Guangdong, 524023, China; The Marine Biomedical Research Institute of Guangdong Zhanjiang, Zhanjiang, Guangdong, 524023, China.
| |
|
20
|
Pan C, Poddar A, Mukherjee R, Ray AK. Impact of categorical and numerical features in ensemble machine learning frameworks for heart disease prediction. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
21
|
Yuan X, Chen S, Sun C, Yuwen L. A novel early diagnostic framework for chronic diseases with class imbalance. Sci Rep 2022; 12:8614. [PMID: 35597855 PMCID: PMC9123399 DOI: 10.1038/s41598-022-12574-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/18/2022] [Accepted: 05/12/2022] [Indexed: 11/09/2022] Open
Abstract
Chronic diseases are one of the most severe health issues in the world, due to their difficult clinical presentations such as long onset cycle, insidious symptoms, and various complications. Recently, machine learning has become a promising technique to assist the early diagnosis of chronic diseases. However, existing works ignore the problems of feature hiding and imbalanced class distribution in chronic disease datasets. In this paper, we present a universal and efficient diagnostic framework that alleviates these two problems so that chronic diseases can be diagnosed in a timely and accurate manner. Specifically, we first propose a network-limited polynomial neural network (NLPNN) algorithm to efficiently capture high-level features hidden in chronic disease datasets, which is data augmentation in terms of its feature space and can also avoid over-fitting. Then, to alleviate the class imbalance problem, we further propose an attention-empowered NLPNN algorithm to improve the diagnostic accuracy for sick cases, which is also data augmentation in terms of its sample space. We evaluate the proposed framework on nine public and two real chronic disease datasets (partly with class imbalance). Extensive experimental results demonstrate that the proposed diagnostic algorithms outperform state-of-the-art machine learning algorithms and achieve superior performance in terms of accuracy, recall, F1, and G_mean. The proposed framework can help to diagnose chronic diseases accurately and in a timely manner at an early stage.
Affiliation(s)
- Xiaohan Yuan
- School of Big Data and Software Engineering, Chongqing University, Chongqing, China
| | - Shuyu Chen
- School of Big Data and Software Engineering, Chongqing University, Chongqing, China.
| | - Chuan Sun
- School of Big Data and Software Engineering, Chongqing University, Chongqing, China
| | - Lu Yuwen
- School of Big Data and Software Engineering, Chongqing University, Chongqing, China
| |
|
22
|
Automatic Classification and Coding of Prefabricated Components Using IFC and the Random Forest Algorithm. BUILDINGS 2022. [DOI: 10.3390/buildings12050688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
The management of prefabricated component staging and turnover lacks the effective integration of informatization and complexity, as relevant information is stored in the heterogeneous systems of various stakeholders. BIM and its underlying data schema, IFC, provide for information collaboration and sharing. In this paper, an automatic classification and coding system for prefabricated building, based on BIM technology and Random Forest, is developed so as to enable the unique representation of components. The proposed approach starts with classifying and coding information regarding the overall design of the components. With the classification criteria, the required attributes of the components are extracted, and the process of attribute extraction is illustrated in detail using wall components as an example. The Random Forest model is then employed for IFC building component classification training and testing, which includes the selection of the datasets, the construction of CART, and the voting of the component classification results. The experiment results illustrate that the approach can automate the uniform and unique coding of each component on a Python basis, while also reducing the workload of designers. Finally, based on the IFC physical file, an extended implementation process for component encoding information is designed to achieve information integrity for prefabricated component descriptions. Additionally, in the subsequent research, it can be further combined with Internet-of-Things technology to achieve the real-time collection of construction process information and the real-time control of building components.
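The final step the abstract mentions, voting on the component classification results, can be pictured as a majority vote over the labels returned by the individual CART trees. The sketch below illustrates only that step under assumed inputs; the class names and the per-tree outputs are hypothetical and do not come from the paper.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees."""
    label, _count = Counter(tree_predictions).most_common(1)[0]
    return label


# Hypothetical per-tree outputs for one building component.
votes = ["wall", "wall", "slab", "wall", "beam"]
predicted = majority_vote(votes)
print(predicted)
```

Ties are resolved here by whichever label was seen first, which is a property of `Counter.most_common` rather than anything stated in the source.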
|
23
|
On the Scale Effect of Relationship Identification between Land Surface Temperature and 3D Landscape Pattern: The Application of Random Forest. REMOTE SENSING 2022. [DOI: 10.3390/rs14020279] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/16/2022]
Abstract
Urbanization processes greatly change urban landscape patterns and the urban thermal environment. Significant multi-scale correlation exists between the land surface temperature (LST) and landscape pattern. Compared with traditional linear regression methods, the regression model based on random forest has the advantages of higher accuracy and better learning ability, and can remove the linear correlation between regression features. Taking Beijing’s metropolitan area as an example, this paper conducted multi-scale relationship analysis between 3D landscape patterns and LST using Pearson Correlation Coefficient (PCC), Multiple Linear Regression and Random Forest Regression (RFR). The results indicated that LST was relatively high in the central area of Beijing, and decreased from the center to the surrounding areas. The interpretation effect of 3D landscape metrics on LST was more obvious than that of the 2D landscape metrics, and 3D landscape diversity and evenness played more important roles than the other metrics in the change of LST. The multi-scale relationship between LST and the landscape pattern was discovered in the fourth ring road of Beijing, the effect of the extent of change on the landscape pattern is greater than that of the grain size change, and the interpretation effect and correlation of landscape metrics on LST increase with the increase in the rectangle size. Impervious surfaces significantly increased the LST, while the impervious surfaces located at low building areas were more likely to increase LST than those located at tall building areas. It seems that increasing the distance between buildings to improve the rate of energy exchange between urban and rural areas can effectively decrease LST. Vegetation and water can effectively reduce LST, but large, clustered and irregularly shaped patches have a better effect on land surface cooling than small and discrete patches. 
The Coefficients of Rectangle Variation (CORV) power function fitting results of landscape metrics showed that the optimal rectangle size for studying the relationship between the 3D landscape pattern and LST is about 700 m. Our study is useful for future urban planning and provides references to mitigate the daytime urban heat island (UHI) effect.
|
24
|
Prasanna SL, Challa NP. Heart Disease Prediction Using Optimal Mayfly Technique with Ensemble Models. INTERNATIONAL JOURNAL OF SWARM INTELLIGENCE RESEARCH 2022. [DOI: 10.4018/ijsir.313665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
This paper proposes a methodology consisting of two phases: attribute selection and classification based on the selected attributes. Phase 1 uses a newly introduced feature selection algorithm, the optimal mayfly algorithm (OMA), to solve the feature selection problem. The mayfly algorithm derived features of physiological and anatomical relevance, such as ST depression, the highest heart rate, cholesterol, chest pain, and heart vessels. In the second phase, the selected attributes are passed to ensemble classifiers such as random subspace, bagging, and boosting. The optimal mayfly algorithm (OMA) with the boosting technique had the highest accuracy. True disease, false disease, accuracy, and specificity are measured to evaluate the proposed system's efficiency. The proposed method, which combines feature selection and ensemble techniques, performs well: the optimal mayfly algorithm with the boosting ensemble classifier achieved a model accuracy of 97.12%, the highest accuracy value compared with any single model.
|