1
Fereidooni D, Karimi Z, Ghasemi F. Non-destructive test-based assessment of uniaxial compressive strength and elasticity modulus of intact carbonate rocks using stacking ensemble models. PLoS One 2024; 19:e0302944. [PMID: 38857272 PMCID: PMC11164374 DOI: 10.1371/journal.pone.0302944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 04/14/2024] [Indexed: 06/12/2024] Open
Abstract
The uniaxial compressive strength (UCS) and elasticity modulus (E) of intact rock are two fundamental parameters in engineering applications. These parameters can be measured either directly from the uniaxial compressive strength test or indirectly by using soft computing predictive models. In the present research, the UCS and E of intact carbonate rocks have been predicted from simple non-destructive laboratory test results by introducing two stacking ensemble learning models. For this purpose, dry unit weight, porosity, P-wave velocity, Brinell surface hardness, UCS, and static E were measured for 70 carbonate rock samples. Then, two stacking ensemble learning models were developed for estimating the UCS and E of the rocks. The applied stacking ensemble learning method integrates the advantages of two base models in the first level, where the base models are multi-layer perceptron (MLP) and random forest (RF) for predicting UCS, and support vector regressor (SVR) and extreme gradient boosting (XGBoost) for predicting E. Grid search with k-fold cross-validation is applied to tune the parameters of both the base models and the meta-learner. The results demonstrate the generalization ability of the stacking ensemble method compared with the base models in terms of common performance measures. The values of the coefficient of determination (R²) obtained from the stacking ensemble are 0.909 and 0.831 for predicting UCS and E, respectively. Similarly, the stacking ensemble yielded root mean squared error (RMSE) values of 1.967 and 0.621 for the prediction of UCS and E, respectively. Accordingly, the proposed models outperform SVR and MLP as single models and RF and XGBoost as two representative ensemble models. Furthermore, sensitivity analysis is carried out to investigate the impact of the input parameters.
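The stacking setup described in this abstract (MLP and RF as base models, a meta-learner on top, grid search with k-fold cross-validation) can be sketched with scikit-learn. This is only an illustration under stated assumptions: the data are synthetic, the `Ridge` meta-learner, the parameter grid, and the fold counts are hypothetical choices, and none of them are taken from the paper.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a 70-sample, 4-input data set (not the paper's data).
X, y = make_regression(n_samples=70, n_features=4, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Level-0 base models: an MLP on scaled inputs and a random forest.
base_models = [
    ("mlp", make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs",
                     max_iter=500, random_state=0),
    )),
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
]

# Level-1 meta-learner (Ridge is an assumption), fitted on out-of-fold
# predictions of the base models.
stack = StackingRegressor(estimators=base_models, final_estimator=Ridge(), cv=3)

# Grid search with k-fold CV over one base-model and one meta-learner parameter.
param_grid = {"rf__max_depth": [3, None], "final_estimator__alpha": [0.1, 1.0]}
search = GridSearchCV(
    stack,
    param_grid,
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluate with the same two measures the abstract reports.
y_pred = search.predict(X_test)
r2 = r2_score(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
print(search.best_params_, round(r2, 3), round(rmse, 3))
```

The scores printed here describe the synthetic data only and are not comparable with the values reported in the abstract.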
Affiliation(s)
- Zohre Karimi
- School of Engineering, Damghan University, Damghan, Semnan, Iran
- Fatemeh Ghasemi
- School of Earth Sciences, Damghan University, Damghan, Semnan, Iran
2
Zhang X, Chen S, Zhang P, Wang C, Wang Q, Zhou X. Staging of Liver Fibrosis Based on Energy Valley Optimization Multiple Stacking (EVO-MS) Model. Bioengineering (Basel) 2024; 11:485. [PMID: 38790352 PMCID: PMC11117710 DOI: 10.3390/bioengineering11050485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Revised: 05/09/2024] [Accepted: 05/10/2024] [Indexed: 05/26/2024] Open
Abstract
Currently, staging the degree of liver fibrosis predominantly relies on liver biopsy, a method fraught with potential risks, such as bleeding and infection. With the rapid development of medical imaging devices, quantification of liver fibrosis through image processing technology has become feasible. Stacking is an effective ensemble technique with potential for this task, but manually tuning it to find the optimal configuration is challenging. Therefore, this paper proposes a novel EVO-MS model, a multiple stacking ensemble learning model optimized by the energy valley optimization (EVO) algorithm, to select the most informative features for fibrosis quantification. Liver contours are profiled from 415 biopsy-proven CT cases, from which 10 shape features are calculated and input into a Support Vector Machine (SVM) classifier to generate predictions; then the EVO algorithm is applied to find the optimal parameter combination to fuse six base models: K-Nearest Neighbors (KNN), Decision Tree (DT), Naive Bayes (NB), Extreme Gradient Boosting (XGB), Gradient Boosting Decision Tree (GBDT), and Random Forest (RF), to create a well-performing ensemble model. Experimental results indicate that selecting 3-5 feature parameters yields satisfactory results in classification, with features such as the contour roundness non-uniformity (Rmax), maximum peak height of contour (Rp), and maximum valley depth of contour (Rm) significantly influencing classification accuracy. The improved EVO algorithm, combined with a multiple stacking model, achieves an accuracy of 0.864, a precision of 0.813, a sensitivity of 0.912, a specificity of 0.824, and an F1-score of 0.860, which demonstrates the effectiveness of our EVO-MS model in staging the degree of liver fibrosis.
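The five measures quoted in this abstract are all functions of confusion-matrix counts. A minimal sketch follows, assuming the binary definitions; the abstract does not say how the measures were averaged across fibrosis stages, so the helper names and the counts in the example are hypothetical.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Binary classification measures from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def f1_from(precision, sensitivity):
    """F1 as the harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Hypothetical counts, used only to exercise the function.
example = confusion_metrics(tp=9, fp=1, fn=1, tn=9)

# Cross-check of the reported values: precision 0.813, sensitivity 0.912.
reported_f1 = f1_from(0.813, 0.912)
print(example, reported_f1)
```

Under the binary definition, the harmonic mean of the reported precision and sensitivity comes to about 0.860, which agrees with the F1-score stated in the abstract.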
Affiliation(s)
- Xuejun Zhang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China; (X.Z.); (P.Z.); (C.W.)
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, China
- Shengxiang Chen
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China; (X.Z.); (P.Z.); (C.W.)
- Pengfei Zhang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China; (X.Z.); (P.Z.); (C.W.)
- Chun Wang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China; (X.Z.); (P.Z.); (C.W.)
- Qibo Wang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China; (X.Z.); (P.Z.); (C.W.)
- Xiangrong Zhou
- Department of Electrical, Electronic and Computer Engineering, Gifu University, Gifu 501-1193, Japan;
3
Peng T, Xiong J, Sun K, Qian S, Tao Z, Nazir MS, Zhang C. Research and application of a novel selective stacking ensemble model based on error compensation and parameter optimization for AQI prediction. ENVIRONMENTAL RESEARCH 2024; 247:118176. [PMID: 38215922 DOI: 10.1016/j.envres.2024.118176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 12/11/2023] [Accepted: 01/09/2024] [Indexed: 01/14/2024]
Abstract
With the ongoing process of industrialization, declining air quality is increasingly becoming a critical concern. Accurate prediction of the Air Quality Index (AQI), an all-inclusive measure representing the extent of pollutants present in the atmosphere, is of paramount importance. This study introduces a novel methodology that combines stacking ensemble and error correction to improve AQI prediction. Additionally, the reptile search algorithm (RSA) is employed for optimizing model parameters. In this study, AQI data from four distinct regions, comprising 34,864 samples, are collected. Initially, we perform cross-validation on ten commonly used single models to obtain prediction results. Then, based on evaluation indices, five models are selected for the ensemble. The results of the study show that the model proposed in this paper achieves an improvement of around 10% in terms of accuracy when compared to the conventional model. Thus, the model introduced in this study offers a more scientifically grounded approach to tackling air pollution.
Affiliation(s)
- Tian Peng
- Faculty of Automation, Huaiyin Institute of Technology, Huai'an, 223003, China; Jiangsu Permanent Magnet Motor Engineering Research Center, Huaiyin Institute of Technology, Huai'an, 223003, China.
- Jinlin Xiong
- Faculty of Automation, Huaiyin Institute of Technology, Huai'an, 223003, China
- Kai Sun
- Faculty of Automation, Huaiyin Institute of Technology, Huai'an, 223003, China
- Shijie Qian
- Faculty of Automation, Huaiyin Institute of Technology, Huai'an, 223003, China
- Zihan Tao
- Faculty of Automation, Huaiyin Institute of Technology, Huai'an, 223003, China
- Chu Zhang
- Faculty of Automation, Huaiyin Institute of Technology, Huai'an, 223003, China; Jiangsu Permanent Magnet Motor Engineering Research Center, Huaiyin Institute of Technology, Huai'an, 223003, China.
4
Wu JM, Qiu WR, Liu Z, Xu ZC, Zhang SH. Integrative approach for classifying male tumors based on DNA methylation 450K data. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:19133-19151. [PMID: 38052593 DOI: 10.3934/mbe.2023845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Malignancies such as bladder urothelial carcinoma, colon adenocarcinoma, liver hepatocellular carcinoma, lung adenocarcinoma and prostate adenocarcinoma significantly impact men's well-being. Accurate cancer classification is vital in determining treatment strategies and improving patient prognosis. This study introduced an innovative method that utilizes gene selection from high-dimensional datasets to enhance the performance of the male tumor classification algorithm. The method assesses the reliability of DNA methylation data to distinguish the five most prevalent types of male cancers from normal tissues by employing DNA methylation 450K data obtained from The Cancer Genome Atlas (TCGA) database. First, the chi-square test is used for dimensionality reduction and second, L1 penalized logistic regression is used for feature selection. Furthermore, the stacking ensemble learning technique was employed to integrate seven common multiclassification models. Experimental results demonstrated that the ensemble learning model utilizing multiple classification models outperformed any base classification model. The proposed ensemble model achieved an astonishing overall accuracy (ACC) of 99.2% in independent testing data. Moreover, it may present novel ideas and pathways for the early detection and treatment of future diseases.
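The three-stage workflow in this abstract (chi-square filtering, L1-penalized logistic regression for feature selection, then a stacking ensemble) maps onto a scikit-learn pipeline. The sketch below is an assumption-laden illustration: the data are synthetic, only two base classifiers stand in for the seven unnamed models, and `k=50`, `C=1.0`, and the `saga` solver are hypothetical settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectFromModel, SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic high-dimensional stand-in (not TCGA methylation data).
X, y = make_classification(
    n_samples=300, n_features=200, n_informative=15, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Stage 3: stacking ensemble; two base models stand in for the paper's seven.
stack = StackingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)

model = Pipeline([
    # chi2 requires non-negative inputs, so rescale to [0, 1] first.
    ("scale", MinMaxScaler()),
    # Stage 1: chi-square test keeps the 50 highest-scoring features.
    ("chi2", SelectKBest(chi2, k=50)),
    # Stage 2: L1-penalized logistic regression drops zero-weight features.
    ("l1", SelectFromModel(
        LogisticRegression(penalty="l1", solver="saga", C=1.0, max_iter=5000)
    )),
    ("stack", stack),
])
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
n_kept = int(model.named_steps["l1"].get_support().sum())
print(n_kept, round(accuracy, 3))
```

Putting the selectors inside the pipeline means they are fitted on training data only, which avoids leaking test information into feature selection.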
Affiliation(s)
- Ji-Ming Wu
- Computer Department, Jing-De-Zhen Ceramic University, Jingdezhen 333403, China
- Wang-Ren Qiu
- Computer Department, Jing-De-Zhen Ceramic University, Jingdezhen 333403, China
- Zi Liu
- Computer Department, Jing-De-Zhen Ceramic University, Jingdezhen 333403, China
- Zhao-Chun Xu
- Computer Department, Jing-De-Zhen Ceramic University, Jingdezhen 333403, China
- Shou-Hua Zhang
- Department of General Surgery, Jiangxi Provincial Children's Hospital, Nanchang 330006, China
5
Patwary MSA, Das KP. Forecasting stock indices with the COVID-19 infection rate as an exogenous variable. PeerJ Comput Sci 2023; 9:e1532. [PMID: 37705632 PMCID: PMC10495988 DOI: 10.7717/peerj-cs.1532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 07/20/2023] [Indexed: 09/15/2023]
Abstract
Forecasting stock market indices is challenging because stock prices are usually nonlinear and non-stationary. COVID-19 has had a significant impact on stock market volatility, which makes forecasting more challenging. Since the number of confirmed cases significantly impacted the stock price index, it has been considered a covariate in this analysis. The primary focus of this study is to address the challenge of forecasting volatile stock indices during COVID-19 by employing time series analysis. In particular, the goal is to find the best method to predict future stock price indices in relation to the COVID-19 infection rate. In this study, the effect of covariates has been analyzed for three stock indices: S&P 500, Morgan Stanley Capital International (MSCI) world stock index, and the Chicago Board Options Exchange (CBOE) Volatility Index (VIX). Results show that parametric approaches can be good forecasting models for the S&P 500 index and the VIX index. On the other hand, a random walk model can be adopted to forecast the MSCI index. Moreover, among the three random walk forecasting methods for the MSCI index, the naïve method provides the best forecasting model.
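For readers unfamiliar with random walk forecasts, a short sketch may help. The abstract does not name its three random walk variants; the naive and drift forms below are common textbook ones and are an assumption here, as are the function names and the toy series.

```python
def naive_forecast(series, horizon):
    """Naive method: every future value equals the last observation."""
    return [series[-1]] * horizon


def drift_forecast(series, horizon):
    """Random walk with drift: extend the average historical change."""
    slope = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + step * slope for step in range(1, horizon + 1)]


# Toy series for illustration only (not index data from the paper).
history = [1.0, 2.0, 4.0]
print(naive_forecast(history, 2))
print(drift_forecast(history, 2))
```

The naive forecast is flat at the last observed value, while the drift forecast continues the straight line through the first and last observations.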
Affiliation(s)
- Kumer Pial Das
- Research, Innovation, and Economic Development, University of Louisiana at Lafayette, Lafayette, LA, United States of America
6
Ribeiro MHDM, da Silva RG, Larcher JHK, Mendes A, Mariani VC, Coelho LDS. Decoding Electroencephalography Signal Response by Stacking Ensemble Learning and Adaptive Differential Evolution. SENSORS (BASEL, SWITZERLAND) 2023; 23:7049. [PMID: 37631586 PMCID: PMC10459492 DOI: 10.3390/s23167049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 07/29/2023] [Accepted: 08/02/2023] [Indexed: 08/27/2023]
Abstract
Electroencephalography (EEG) is an exam widely adopted to monitor cerebral activities regarding external stimuli, and its signals compose a nonlinear dynamical system. There are many difficulties associated with EEG analysis. For example, noise can originate from different disorders, such as muscle or physiological activity. There are also artifacts that are related to undesirable signals during EEG recordings, and finally, nonlinearities can occur due to brain activity and its relationship with different brain regions. All these characteristics make data modeling a difficult task. Therefore, using a combined approach can be the best solution to obtain an efficient model for identifying neural data and developing reliable predictions. This paper proposes a new hybrid framework combining stacked generalization (STACK) ensemble learning and a differential-evolution-based algorithm called Adaptive Differential Evolution with an Optional External Archive (JADE) to perform nonlinear system identification. In the proposed framework, five base learners, namely, eXtreme Gradient Boosting, a Gaussian Process, Least Absolute Shrinkage and Selection Operator, a Multilayer Perceptron Neural Network, and Support Vector Regression with a radial basis function kernel, are trained. The predictions from all these base learners compose STACK's layer-0 and are adopted as inputs of the Cubist model, whose hyperparameters were obtained by JADE. The model was evaluated for decoding the electroencephalography signal response to wrist joint perturbations. The variance accounted for (VAF), root-mean-squared error (RMSE), and Friedman statistical test were used to validate the performance of the proposed model and compare its results with other methods in the literature, including the base learners. The JADE-STACK model outperforms the other models in terms of accuracy, explaining on average across all participants 94.50% and 67.50% (standard deviations of 1.53 and 7.44, respectively) of the data variability for one step ahead and three steps ahead, which makes it a suitable approach to dealing with nonlinear system identification. Also, the improvement over state-of-the-art methods ranges from 0.6% to 161% and 43.34% for one step ahead and three steps ahead, respectively. Therefore, the developed model can be viewed as an alternative and additional approach to well-established techniques for nonlinear system identification, since it can achieve satisfactory results regarding the data variability explanation.
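The variance accounted for (VAF) measure used in this abstract is not defined there. A minimal sketch follows, assuming the definition common in system identification, VAF = 100 · (1 − var(y − ŷ) / var(y)); the function name and the toy values are hypothetical.

```python
from statistics import pvariance


def vaf(y_true, y_pred):
    """Variance accounted for, in percent (assumed standard definition)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 100.0 * (1.0 - pvariance(residuals) / pvariance(y_true))


# Toy signal for illustration only.
signal = [1.0, 2.0, 3.0, 4.0]
perfect = vaf(signal, signal)        # prediction equals the signal
constant = vaf(signal, [2.5] * 4)    # prediction is just the mean
print(perfect, constant)
```

A perfect prediction gives 100%, and predicting the mean of the signal gives 0%, so the 94.50% and 67.50% figures above can be read on that scale.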
Affiliation(s)
- Matheus Henrique Dal Molin Ribeiro
- Industrial and Systems Engineering Graduate Program (PPGEPS), Pontifical Catholic University of Paraná (PUCPR), R. Imaculada Conceição 1155, Curitiba 80215-901, PR, Brazil;
- Department of Mathematics, Federal University of Technology—Paraná (UTFPR), Via do Conhecimento, KM 01—Fraron, Pato Branco 85503-390, PR, Brazil
- Ramon Gomes da Silva
- Industrial and Systems Engineering Graduate Program (PPGEPS), Pontifical Catholic University of Paraná (PUCPR), R. Imaculada Conceição 1155, Curitiba 80215-901, PR, Brazil;
- José Henrique Kleinubing Larcher
- Mechanical Engineering Graduate Program (PPGEM), Pontifical Catholic University of Paraná (PUCPR), R. Imaculada Conceição 1155, Curitiba 80215-901, PR, Brazil; (J.H.K.L.); (V.C.M.)
- Andre Mendes
- Department of Economics, Massachusetts Institute of Technology, 292 Main St, Cambridge, MA 02142, USA;
- Viviana Cocco Mariani
- Mechanical Engineering Graduate Program (PPGEM), Pontifical Catholic University of Paraná (PUCPR), R. Imaculada Conceição 1155, Curitiba 80215-901, PR, Brazil; (J.H.K.L.); (V.C.M.)
- Department of Electrical Engineering, Federal University of Paraná (UFPR), R. Evaristo F. Ferreira da Costa 384, Curitiba 81530-000, PR, Brazil
- Leandro dos Santos Coelho
- Industrial and Systems Engineering Graduate Program (PPGEPS), Pontifical Catholic University of Paraná (PUCPR), R. Imaculada Conceição 1155, Curitiba 80215-901, PR, Brazil;
- Department of Electrical Engineering, Federal University of Paraná (UFPR), R. Evaristo F. Ferreira da Costa 384, Curitiba 81530-000, PR, Brazil
7
Zhang B, Ling L, Zeng L, Hu H, Zhang D. Multi-step prediction of carbon emissions based on a secondary decomposition framework coupled with stacking ensemble strategy. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023:10.1007/s11356-023-27109-8. [PMID: 37156950 PMCID: PMC10166696 DOI: 10.1007/s11356-023-27109-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 04/15/2023] [Indexed: 05/10/2023]
Abstract
Accurate prediction of carbon emissions is vital to achieving carbon neutrality, which is one of the major goals of the global effort to protect the ecological environment. However, due to the high complexity and volatility of carbon emission time series, it is hard to forecast carbon emissions effectively. This research offers a novel decomposition-ensemble framework for multi-step prediction of short-term carbon emissions. The proposed framework involves three main steps: (i) Data decomposition: a secondary decomposition method, which is a combination of empirical wavelet transform (EWT) and variational mode decomposition (VMD), is used to process the original data. (ii) Prediction and selection: ten models are used to forecast the processed data. Then, neighborhood mutual information (NMI) is used to select suitable sub-models from candidate models. (iii) Stacking ensemble: the stacking ensemble learning method is introduced to integrate the selected sub-models and output the final prediction results. For illustration and verification, the carbon emissions of three representative EU countries are used as our sample data. The empirical results show that the proposed framework is superior to other benchmark models in predictions 1, 15, and 30 steps ahead, with the mean absolute percentage error (MAPE) of the proposed framework being as low as 5.4475% on the Italy dataset, 7.3159% on the France dataset, and 8.6821% on the Germany dataset.
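The MAPE figures quoted above follow the usual definition, the mean of |actual − forecast| / |actual| expressed in percent. A small sketch, with a hypothetical function name and toy numbers that are not taken from the paper:

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy values: both forecasts are off by 10% of the actual value.
toy_mape = mape([100.0, 200.0], [110.0, 180.0])
print(toy_mape)
```

Because each error is divided by the actual value, MAPE is scale-free, which is what allows the Italy, France, and Germany results to be compared directly.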
Affiliation(s)
- Boting Zhang
- College of Mathematics and Information, South China Agricultural University, Guangzhou, 510642, China
- Liwen Ling
- College of Mathematics and Information, South China Agricultural University, Guangzhou, 510642, China
- Institute of Rural Revitalization Research, South China Agricultural University, Guangzhou, 510642, China
- Liling Zeng
- College of Mathematics and Information, South China Agricultural University, Guangzhou, 510642, China
- Huanling Hu
- College of Mathematics and Information, South China Agricultural University, Guangzhou, 510642, China
- Dabin Zhang
- College of Mathematics and Information, South China Agricultural University, Guangzhou, 510642, China.
- Institute of Rural Revitalization Research, South China Agricultural University, Guangzhou, 510642, China.
8
Yan P, Chen F, Zhao T, Zhang H, Kan X, Liu Y. Transformer fault diagnosis research based on LIF technology and IAO optimization of LightGBM. ANALYTICAL METHODS : ADVANCING METHODS AND APPLICATIONS 2023; 15:261-274. [PMID: 36546319 DOI: 10.1039/d2ay01745h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Transformer fault diagnosis is a necessary operation to ensure the stable operation of a power system. In view of the low diagnostic rate and long time needed by traditional methods, such as the dissolved gas in oil method, a laser-induced fluorescence (LIF) spectral technology is proposed in this paper, which incorporates an improved aquila optimizer (IAO) and light gradient boosting machine (LightGBM), to predict the types of transformer faults. The original AO was improved using the Nelder-Mead (NM) simplex search method and opposition-based learning (OBL) mechanism, which could improve the parameter optimization ability of the model. Normal oil, thermal fault oil, local moisture oil, and electrical fault oil were selected as experimental samples. First, the spectral images of the four oil samples were obtained by LIF technology, and the fluorescence spectral curves obtained were preprocessed by multivariate scattering correction (MSC) and normalization, while kernel principal component analysis (KPCA) was used for dimensionality reduction. The dimensionality-reduced data were then imported into the LightGBM model for training, and the IAO algorithm was used to optimize the parameters of the LightGBM. Finally, the experiment showed that the LIF technology demonstrated good recognition of the fault types for transformer fault diagnosis; the data purity after MSC preprocessing was higher than that of other processing methods; the prediction effect of the LightGBM model was superior to other prediction models; the LightGBM model optimized by IAO had better convergence, parameter optimization ability, and prediction accuracy than the LightGBM model optimized by the original AO and particle swarm optimization (PSO). Among the models, the MSC-IAO-LightGBM model had the best effect on fault prediction, with the mean square error (MSE) reaching 9.0643 × 10⁻⁷, mean absolute error (MAE) reaching 8.7439 × 10⁻⁴, and goodness of fit (R²) approaching 1. It can be implemented as a new diagnostic method in transformer fault detection, which is of great significance to ensure the stable and safe operation of power systems.
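The MSC preprocessing step named in this abstract can be illustrated with NumPy. The sketch assumes the usual formulation (often expanded as multiplicative scatter correction): each spectrum is regressed on a reference spectrum, here the mean spectrum, and the fitted offset and slope are removed. The function name `msc` and the toy spectra are hypothetical and do not come from the paper.

```python
import numpy as np


def msc(spectra, reference=None):
    """Scatter-correct each row of `spectra` against a reference spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        ref = spectra.mean(axis=0)
    else:
        ref = np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        # Least-squares fit: spectrum ≈ intercept + slope * ref
        slope, intercept = np.polyfit(ref, spectrum, deg=1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected, ref


# Toy spectra: one curve observed with three different offsets and gains.
base = np.linspace(1.0, 2.0, 50) ** 2
raw = np.vstack([base, 2.0 * base + 1.0, 0.5 * base - 0.2])
corrected, ref = msc(raw)
print(np.allclose(corrected, ref))
```

In this toy case the three rows differ only by offset and gain, so after correction they all coincide with the reference, which is the effect MSC is meant to have on scatter-induced variation.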
Affiliation(s)
- Pengcheng Yan
- School of Electrical and Information Engineering, Anhui University of Science & Technology, Huainan 232001, China.
- Fengxiang Chen
- School of Electrical and Information Engineering, Anhui University of Science & Technology, Huainan 232001, China.
- Tianjian Zhao
- Zhuji Power Supply Company of State Grid Zhejiang Electric Power Co. Ltd, Zhuji 311800, China
- Heng Zhang
- School of Electrical and Information Engineering, Anhui University of Science & Technology, Huainan 232001, China.
- Xuyue Kan
- School of Electrical and Information Engineering, Anhui University of Science & Technology, Huainan 232001, China.
- Yang Liu
- School of Artificial Intelligence, Anhui University of Science & Technology, Huainan 232001, China
9
Kumar A, Misra SC, Chan FTS. Leveraging AI for advanced analytics to forecast altered tourism industry parameters: A COVID-19 motivated study. EXPERT SYSTEMS WITH APPLICATIONS 2022; 210:118628. [PMID: 36032358 PMCID: PMC9394102 DOI: 10.1016/j.eswa.2022.118628] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/29/2021] [Revised: 08/16/2022] [Accepted: 08/16/2022] [Indexed: 06/15/2023]
Abstract
The COVID-19 pandemic has given a sudden shock to economic indices worldwide and especially to the tourism sector, which is already very sensitive to crises such as natural calamities, terrorist activities, virus outbreaks and other unwanted conditions. The economic implications of a reduction in tourism demand, and the need to analyse post-COVID-19 tourism, motivate our research. This study aims to forecast the future trends for foreign tourist arrivals and foreign exchange earnings for India and to formulate a model to predict the future trends based on the COVID-19 parameters, vaccinations and stringency index (government travel guidelines). In the study, we have developed artificial intelligence models (random forest, linear regression) using the stacking-based ensemble learning method for the development of base models and meta models for the study of COVID-19 and its effect on the tourism industry. The architecture of a stacking model consists of two or more base models, often referred to as level-0 models, and a meta-model that combines the predictions of the base models and is referred to as a level-1 model (Smyth & Wolpert, 1999). The results show that the projected losses require quick action on developing new practices to sustain and complement the resilience of tourism per se.
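The level-0 and level-1 architecture described in this abstract can be written out by hand to show where the meta-model's inputs come from. The sketch below uses scikit-learn and is only an assumption-based illustration: the data are synthetic, random forest and linear regression are used as level-0 models with a linear level-1 model, and the abstract does not state which model played which role.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

# Synthetic regression data (not the tourism data from the paper).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 3))
y = 3.0 * X[:, 0] + np.sin(4.0 * X[:, 1]) + rng.normal(0.0, 0.1, size=120)

# Level-0 (base) models.
level0 = [
    RandomForestRegressor(n_estimators=50, random_state=0),
    LinearRegression(),
]

# Out-of-fold predictions of each base model become the level-1 features,
# so the meta-model never sees predictions made on a model's own training rows.
Z = np.column_stack([cross_val_predict(model, X, y, cv=5) for model in level0])

# Level-1 (meta) model combines the base-model predictions.
meta = LinearRegression().fit(Z, y)

# Refit the base models on all data for use at prediction time.
for model in level0:
    model.fit(X, y)


def stacked_predict(X_new):
    """Feed base-model predictions for X_new through the meta-model."""
    Z_new = np.column_stack([model.predict(X_new) for model in level0])
    return meta.predict(Z_new)


print(stacked_predict(X[:5]))
```

Using out-of-fold rather than in-sample predictions for `Z` is the design choice that keeps the level-1 model from simply trusting whichever base model overfits most.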
Key Words
- H(x), Level 1 regressor based on the level 0 regressor ht
- Artificial Intelligence
- COVID-19
- D, Training Data Set
- Foreign Tourist Arrivals
- N, Dataset of Labels
- RM, Input dataset of M number, where M is a finite real number
- Random forest model
- Tourism industry
- ht, base regressor of t number of training data points
- i,j, finite real numbers
- xi, x1, x2, x3…….. where xi is a input dataset
- yi, y1, y2, y3……. where yi is a output dataset label
Affiliation(s)
- Ankur Kumar
- Industrial Management Engineering Indian Institute of Technology, Kanpur, Kanpur, Uttar Pradesh, India
- Subhas Chandra Misra
- Industrial Management Engineering Indian Institute of Technology, Kanpur, Kanpur, Uttar Pradesh, India
- Felix T S Chan
- Department of Decision Sciences, Macau University of Science and Technology, Taipa, Macao
10
Guo X, Ma J, Zubiaga A. Cluster-based deep ensemble learning for emotion classification in Internet memes. J Inf Sci 2022. [DOI: 10.1177/01655515221136241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
Abstract
Memes have gained popularity as a means to share visual ideas through the Internet and social media by mixing text, images and videos, often for humorous purposes. Research enabling automated analysis of memes has gained attention in recent years, including among others the task of classifying the emotion expressed in memes. In this article, we propose a novel model, cluster-based deep ensemble learning (CDEL), for emotion classification in memes. CDEL is a hybrid model that leverages the benefits of a deep learning model in combination with a clustering algorithm, which enhances the model with additional information after clustering memes with similar facial features. We evaluate the performance of CDEL on a benchmark data set for emotion classification, proving its effectiveness by outperforming a wide range of baseline models and achieving state-of-the-art performance. Further evaluation through ablated models demonstrates the effectiveness of the different components of CDEL.
Affiliation(s)
- Xiaoyu Guo
- College of Economics and Management, Nanjing University of Aeronautics and Astronautics, China
- Jing Ma
- College of Economics and Management, Nanjing University of Aeronautics and Astronautics, China
- Arkaitz Zubiaga
- School of Electronic Engineering and Computer Science, Queen Mary University of London, UK
11
Chiu CC, Wu CM, Chien TN, Kao LJ, Li C, Jiang HL. Applying an Improved Stacking Ensemble Model to Predict the Mortality of ICU Patients with Heart Failure. J Clin Med 2022; 11:6460. [PMID: 36362686 PMCID: PMC9659015 DOI: 10.3390/jcm11216460] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/17/2022] [Revised: 10/21/2022] [Accepted: 10/26/2022] [Indexed: 08/31/2023] Open
Abstract
Cardiovascular diseases have been identified as one of the top three causes of death worldwide, with onset and deaths mostly due to heart failure (HF). In the ICU, where patients with HF are at increased risk of death and consume significant medical resources, early and accurate prediction of the time of death for patients at high risk of death would enable them to receive appropriate and timely medical care. The data for this study were obtained from the MIMIC-III database, where we collected vital signs and tests for 6699 HF patients during the first 24 h of their first ICU admission. In order to predict the mortality of HF patients in ICUs more precisely, an integrated stacking model is proposed and applied in this paper. In the first stage of dataset classification, the datasets were subjected to first-level classifiers using RF, SVC, KNN, LGBM, Bagging, and AdaBoost. Then, the fusion of these six classifier decisions was used to construct and optimize the stacked set of second-level classifiers. The results indicate that our model obtained an accuracy of 95.25% and AUROC of 82.55% in predicting the mortality rate of HF patients, which demonstrates the outstanding capability and efficiency of our method. In addition, the results of this study also revealed that platelets, glucose, and blood urea nitrogen were the clinical features that had the greatest impact on model prediction. The results of this analysis not only improve the understanding of patients' conditions by healthcare professionals but also allow for a more optimal use of healthcare resources.
Affiliation(s)
- Chih-Chou Chiu
- Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan
- Chung-Min Wu
- Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan
- Te-Nien Chien
- College of Management, National Taipei University of Technology, Taipei 106, Taiwan
- Ling-Jing Kao
- Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan
- Chengcheng Li
- College of Management, National Taipei University of Technology, Taipei 106, Taiwan
- Han-Ling Jiang
- Alliance Manchester Business School, University of Manchester, Manchester M15 6PB, UK
12
Chen C, Wang N, Chen M, Yan XM. A framework based on heterogeneous ensemble models for liquid steel temperature prediction in LF refining process. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
13
Jiang F, Deng M, Tang J, Fu L, Sun H. Integrating spaceborne LiDAR and Sentinel-2 images to estimate forest aboveground biomass in Northern China. CARBON BALANCE AND MANAGEMENT 2022; 17:12. [PMID: 36048352 PMCID: PMC9438156 DOI: 10.1186/s13021-022-00212-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/06/2022] [Accepted: 08/22/2022] [Indexed: 05/29/2023]
Abstract
BACKGROUND Fast and accurate forest aboveground biomass (AGB) estimation and mapping is the basic work of forest management and ecosystem dynamic investigation, which is of great significance to evaluate forest quality, resource assessment, and carbon cycle and management. The Ice, Cloud, and Land Elevation Satellite-2 (ICESat-2), as one of the latest launched spaceborne light detection and ranging (LiDAR) sensors, can penetrate the forest canopy and has the potential to obtain accurate forest vertical structure parameters on a large scale. However, the along-track segments of canopy height provided by ICESat-2 cannot be used to obtain comprehensive AGB spatial distribution. To make up for the deficiency of spaceborne LiDAR, the Sentinel-2 images provided by google earth engine (GEE) were used as the medium to integrate with ICESat-2 for continuous AGB mapping in our study. Ensemble learning can summarize the advantages of estimation models and achieve better estimation results. A stacking algorithm consisting of four non-parametric base models which are the backpropagation (BP) neural network, k-nearest neighbor (kNN), support vector machine (SVM), and random forest (RF) was proposed for AGB modeling and estimating in Saihanba forest farm, northern China. RESULTS The results show that stacking achieved the best AGB estimation accuracy among the models, with an R2 of 0.71 and a root mean square error (RMSE) of 45.67 Mg/ha. The stacking resulted in the lowest estimation error with the decreases of RMSE by 22.6%, 27.7%, 23.4%, and 19.0% compared with those from the BP, kNN, SVM, and RF, respectively. CONCLUSION Compared with using Sentinel-2 alone, the estimation errors of all models have been significantly reduced after adding the LiDAR variables of ICESat-2 in AGB estimation. The research demonstrated that ICESat-2 has the potential to improve the accuracy of AGB estimation and provides a reference for dynamic forest resources management and monitoring.
Affiliation(s)
- Fugen Jiang
  - Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha, 410004, China
  - Key Laboratory of Forestry Remote Sensing Based Big Data and Ecological Security for Hunan Province, Changsha, 410004, Hunan, China
  - Key Laboratory of State Forestry Administration On Forest Resources Management and Monitoring in Southern Area, Changsha, 410004, Hunan, China
- Muli Deng
  - Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha, 410004, China
  - Key Laboratory of Forestry Remote Sensing Based Big Data and Ecological Security for Hunan Province, Changsha, 410004, Hunan, China
  - Key Laboratory of State Forestry Administration On Forest Resources Management and Monitoring in Southern Area, Changsha, 410004, Hunan, China
- Jie Tang
  - Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha, 410004, China
  - Key Laboratory of Forestry Remote Sensing Based Big Data and Ecological Security for Hunan Province, Changsha, 410004, Hunan, China
  - Key Laboratory of State Forestry Administration On Forest Resources Management and Monitoring in Southern Area, Changsha, 410004, Hunan, China
- Liyong Fu
  - Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha, 410004, China
  - Research Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, Beijing, 100091, China
- Hua Sun
  - Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha, 410004, China
  - Key Laboratory of Forestry Remote Sensing Based Big Data and Ecological Security for Hunan Province, Changsha, 410004, Hunan, China
  - Key Laboratory of State Forestry Administration On Forest Resources Management and Monitoring in Southern Area, Changsha, 410004, Hunan, China
|
14
|
Sharma G, Singh A, Jain S. DeepEvap: Deep reinforcement learning based ensemble approach for estimating reference evapotranspiration. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
15
|
Li Z, Zhang C, Liu H, Zhang C, Zhao M, Gong Q, Fu G. Developing stacking ensemble models for multivariate contamination detection in water distribution systems. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 828:154284. [PMID: 35247409 DOI: 10.1016/j.scitotenv.2022.154284] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/16/2021] [Revised: 02/25/2022] [Accepted: 02/28/2022] [Indexed: 06/14/2023]
Abstract
This study presents a new stacking ensemble model for contamination event detection using multiple water quality parameters. The stacking model consists of a number of machine learning base predictors and a meta-predictor; it is trained using cross-validation to capture different features in multiple water quality parameters and is then used for water quality prediction. For each water quality parameter, the residuals between predicted and measured data are classified to identify anomalies, with thresholds derived from the sequential model-based optimization method and detection probabilities updated using Bayesian analysis. Alarms derived from individual water quality parameters are fused to enhance the anomaly signals and improve the detection accuracy. The proposed stacking-based method is evaluated using a data set of six water quality parameters from a real water distribution system with randomly simulated events. The stacking-based method detected 2496 of a total of 2500 events without a false alarm. The results show that the stacking method outperforms an artificial neural network (ANN) benchmark method in contamination event detection, with a higher true positive rate, a lower false positive rate, and a higher F1 score. This implies that the stacking method shows great promise for detecting contamination events in water distribution systems.
Affiliation(s)
- Zilin Li
  - School of Hydraulic Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Chi Zhang
  - School of Hydraulic Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Haixing Liu
  - School of Hydraulic Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Chao Zhang
  - School of Hydraulic Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Mengke Zhao
  - School of Hydraulic Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Qiang Gong
  - Dalian Water Supply Group Co. Ltd., Dalian, Liaoning 116011, China
- Guangtao Fu
  - Centre for Water Systems, University of Exeter, Exeter EX4 4QF, UK
|
16
|
Alqhtani SM. FLIDND-MCN: Fake label images detection of natural disasters with multi model convolutional neural network. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2022. [DOI: 10.3233/jifs-213308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Disasters occur due to naturally occurring events such as earthquakes, floods, tsunamis, storms, hurricanes, wildfires, and other geologic processes. The influence of fake images of natural disasters posted on social media is increasing day by day. A natural disaster can result in death or destruction of property, as well as economic damage, the severity of which is determined by the resilience of the affected population and the infrastructure available. Many researchers have applied different machine learning approaches to detect and classify natural disaster types, but these algorithms fail to identify fake labels on images of disaster events. Furthermore, when many natural disaster events occur at the same time, these systems cannot handle the classification process and the fake labelling of images. Therefore, to tackle this problem, I propose FLIDND-MCN (Fake Label Image Detection of Natural Disaster types with Multi Model Convolutional Neural Network) for multi-phormic natural disastrous events. The main purpose of this model is to provide accurate information regarding multi-phormic natural disastrous events for emergency response decision making for a particular disaster. The proposed approach consists of a multi-model convolutional neural network (MMCNN) architecture. The dataset used for this purpose is publicly available and consists of 4,428 images of different natural disaster events. The proposed model is evaluated in terms of sensitivity, specificity, accuracy, precision, and F1-score. The proposed model shows an accuracy value of 0.93 percent for detecting fake-label disaster images, which is higher than that of previously proposed state-of-the-art models.
Affiliation(s)
- Samar M. Alqhtani
  - Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran, Saudi Arabia
|
17
|
Ariza-Colpas PP, Vicario E, Oviedo-Carrascal AI, Butt Aziz S, Piñeres-Melo MA, Quintero-Linero A, Patara F. Human Activity Recognition Data Analysis: History, Evolutions, and New Trends. SENSORS 2022; 22:s22093401. [PMID: 35591091 PMCID: PMC9103712 DOI: 10.3390/s22093401] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/31/2021] [Revised: 03/31/2022] [Accepted: 04/04/2022] [Indexed: 01/23/2023]
Abstract
The Assisted Living Environments research area, known as AAL (Ambient Assisted Living), focuses on generating innovative technology, products, and services that provide assistance, medical care, and rehabilitation to older adults, extending the time during which these people can live independently, whether they suffer from neurodegenerative diseases or some disability. This important area is responsible for the development of activity recognition systems (ARS), which are a valuable tool for identifying the type of activity carried out by older adults, so as to provide them with assistance that allows them to carry out their daily activities normally. This article reviews the literature and the evolution of the different techniques for processing this type of data, covering supervised, unsupervised, ensemble learning, deep learning, reinforcement learning, transfer learning, and metaheuristic approaches applied to this sector of health science, and shows the metrics of recent experiments for researchers in this area of knowledge. The review identifies models based on reinforcement or transfer learning as a good line of work for the processing and analysis of human activity recognition.
Affiliation(s)
- Paola Patricia Ariza-Colpas
  - Department of Computer Science and Electronics, Universidad de la Costa CUC, Barranquilla 080002, Colombia
  - Faculty of Engineering in Information and Communication Technologies, Universidad Pontificia Bolivariana, Medellín 050031, Colombia
  - Correspondence:
- Enrico Vicario
  - Department of Information Engineering, University of Florence, 50139 Firenze, Italy
- Ana Isabel Oviedo-Carrascal
  - Faculty of Engineering in Information and Communication Technologies, Universidad Pontificia Bolivariana, Medellín 050031, Colombia
- Shariq Butt Aziz
  - Department of Computer Science and IT, University of Lahore, Lahore 44000, Pakistan
- Fulvio Patara
  - Department of Information Engineering, University of Florence, 50139 Firenze, Italy
|
18
|
Cui S, Qiu H, Wang S, Wang Y. Two-stage stacking heterogeneous ensemble learning method for gasoline octane number loss prediction. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107989] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
|
19
|
Cui S, Wang Y, Wang D, Sai Q, Huang Z, Cheng TCE. A two-layer nested heterogeneous ensemble learning predictive method for COVID-19 mortality. Appl Soft Comput 2021; 113:107946. [PMID: 34646110 PMCID: PMC8494501 DOI: 10.1016/j.asoc.2021.107946] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/31/2021] [Revised: 07/05/2021] [Accepted: 09/22/2021] [Indexed: 12/12/2022]
Abstract
The COVID-19 epidemic has had a great adverse impact on the world, killing hundreds of thousands of people. To help the world better combat COVID-19 and reduce its death toll, this study focuses on COVID-19 mortality. First, using multiple stepwise regression analysis, factors from eight aspects (economy, society, climate, etc.) that may affect the mortality rates of COVID-19 in various countries are examined. In addition, a two-layer nested heterogeneous ensemble learning-based prediction method that combines linear regression (LR), support vector machine (SVM), and extreme learning machine (ELM) is developed to predict the development trends of COVID-19 mortality in various countries. Based on data from 79 countries, the experiment indicates that age structure (proportion of the population over 70 years old) and medical resources (number of beds) are the main factors affecting COVID-19 mortality in each country. It is also found that the number of nucleic acid tests and climatic factors are correlated with COVID-19 mortality. Finally, when predicting COVID-19 mortality, the proposed heterogeneous ensemble learning-based prediction method shows better prediction ability than state-of-the-art machine learning methods such as LR, SVM, ELM, random forest (RF), and long short-term memory (LSTM).
Affiliation(s)
- Shaoze Cui
  - School of Economics and Management, Dalian University of Technology, Dalian 116023, China
- Yanzhang Wang
  - School of Economics and Management, Dalian University of Technology, Dalian 116023, China
- Dujuan Wang
  - Business School, Sichuan University, Chengdu 610064, China
- Qian Sai
  - School of Economics and Management, Dalian University of Technology, Dalian 116023, China
- Ziheng Huang
  - Business School, Sichuan University, Chengdu 610064, China
- T C E Cheng
  - Department of Logistics and Maritime Studies, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
|
20
|
Intelligent Decision Support System for Predicting Student’s E-Learning Performance Using Ensemble Machine Learning. MATHEMATICS 2021. [DOI: 10.3390/math9172078] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/01/2023]
Abstract
Electronic learning management systems provide live environments for students and faculty members to connect with their institutional online portals and perform educational activities virtually. Although modern technologies proactively support these online sessions, students’ active participation remains a challenge that has been discussed in previous research. Additionally, one concern for both parents and teachers is how to accurately measure student performance using different attributes collected during online sessions. Therefore, this study aims to understand and predict student performance based on features extracted from electronic learning management systems. The dataset chosen in this study comes from a learning management system that provides a number of features for predicting student performance. The integrated machine learning model proposed in this research can be useful for making proactive and intelligent decisions according to student performance evaluated through the electronic system’s data. The proposed model consists of five traditional machine learning algorithms, which are further enhanced by applying four ensemble techniques: bagging, boosting, stacking, and voting. The overall F1 scores of the single models are as follows: DT (0.675), RF (0.777), GBT (0.714), NB (0.654), and KNN (0.664). Model performance improved markedly with the ensemble approaches. The stacking model combining all five classifiers recorded the highest F1 score (0.8195) among the ensemble methods. The proposed model can be useful for predicting student performance and helping educators make informed decisions by proactively notifying students.
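To put the reported scores in context, the sketch below shows how an F1 score is computed from confusion-matrix counts and how the gain of the stacking model over the best single model follows from the figures quoted in the abstract. The function name and the example counts are hypothetical, and the abstract does not say how the "overall" F1 was averaged across classes, so the binary formula here is only an assumption.

```python
def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall (binary case)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, only to exercise the formula.
example_f1 = f1_score(tp=8, fp=2, fn=2)

# F1 scores as quoted in the abstract.
reported_f1 = {"DT": 0.675, "RF": 0.777, "GBT": 0.714, "NB": 0.654, "KNN": 0.664}
stacking_f1 = 0.8195

# Best single model and the improvement achieved by stacking.
best_single = max(reported_f1, key=reported_f1.get)
absolute_gain = stacking_f1 - reported_f1[best_single]
relative_gain_pct = 100 * absolute_gain / reported_f1[best_single]
```

On the quoted figures, the best single model is RF, and stacking improves on it by about 0.04 in absolute F1.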
|
21
|
Liang P, Fu Y, Gao K, Sun H. An enhanced group teaching optimization algorithm for multi-product disassembly line balancing problems. COMPLEX INTELL SYST 2021. [DOI: 10.1007/s40747-021-00478-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/15/2023]
Abstract
Big data have been widely studied by numerous scholars and enterprises because of their great power in supporting highly reliable decisions for various complex systems. Remanufacturing systems have recently received much attention because they play significant roles in end-of-life product recovery, environmental protection, and resource conservation. Disassembly is treated as a critical step in remanufacturing systems. In practice, it is difficult to obtain accurate data on end-of-life products, such as disassembly time, because of their varied usage histories, which makes effective and reliable decisions hard to reach. Thus, it is necessary to model the disassembly process with a stochastic programming method in which previously collected data are fitted to stochastic distributions of parameters by applying big data technology. Additionally, highly efficient intelligent optimization algorithms that can handle a variety of complex problems in the disassembly process are urgently needed. To achieve the global optimization of disassembling multiple products simultaneously, this work studies a stochastic multi-product disassembly line balancing problem that maximizes disassembly profit while meeting disassembly time requirements. A chance-constrained programming model is formulated accordingly, and an enhanced group teaching optimization algorithm incorporating a stochastic simulation method is then developed by considering this model’s features. By performing simulation experiments on real-life cases and comparing the method with five well-known approaches, we verify its excellent performance in solving the studied problem.
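The role of stochastic simulation in a chance-constrained model can be illustrated with a small Monte Carlo check of a single constraint of the form P(total disassembly time ≤ cycle time) ≥ α. This is a sketch under assumptions, not the paper's formulation: the normal distributions, the task parameters, the confidence level, and the function name are all hypothetical.

```python
import random

def chance_constraint_holds(mean_times, std_times, cycle_time, alpha,
                            n_samples=10_000, seed=0):
    """Estimate P(sum of task times <= cycle_time) by sampling and compare with alpha.

    Task times are assumed independent and normally distributed,
    truncated at zero; this distributional choice is hypothetical.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    hits = 0
    for _ in range(n_samples):
        total = sum(max(0.0, rng.gauss(m, s))
                    for m, s in zip(mean_times, std_times))
        if total <= cycle_time:
            hits += 1
    prob = hits / n_samples
    return prob >= alpha, prob

# Hypothetical workstation with three disassembly tasks.
means = [3.0, 4.0, 5.0]
stds = [0.3, 0.4, 0.5]
ok_loose, p_loose = chance_constraint_holds(means, stds, cycle_time=14.0, alpha=0.95)
ok_tight, p_tight = chance_constraint_holds(means, stds, cycle_time=11.0, alpha=0.95)
```

In an optimization loop, a check of this kind would be applied to each candidate task assignment, so that only assignments meeting the time requirement with the required probability are treated as feasible.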
|
22
|
Cui S, Wang Y, Yin Y, Cheng T, Wang D, Zhai M. A cluster-based intelligence ensemble learning method for classification problems. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.01.061] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/12/2022]
|
23
|
Hsu MF, Lin SJ. A BSC-based network DEA model equipped with computational linguistics for performance assessment and improvement. INT J MACH LEARN CYB 2021. [DOI: 10.1007/s13042-021-01331-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/23/2023]
|
24
|
|