1
|
Fei Z, Liang S, Cai Y, Shen Y. Ensemble Machine-Learning-Based Prediction Models for the Compressive Strength of Recycled Powder Mortar. MATERIALS (BASEL, SWITZERLAND) 2023; 16:583. [PMID: 36676320 PMCID: PMC9862350 DOI: 10.3390/ma16020583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Revised: 12/27/2022] [Accepted: 01/04/2023] [Indexed: 06/17/2023]
Abstract
Recycled powder (RP) is a promising substitute for cementitious materials in concrete. The compressive strength of RP mortar is a pivotal factor affecting the mechanical properties of RP concrete. Applying machine learning (ML) approaches to engineering problems, particularly to predicting the mechanical properties of construction materials, yields high prediction accuracy at low experimental cost. In this study, 204 groups of RP mortar compression experimental data were collected from the literature to establish a dataset for ML, with 163 groups in the training set and 41 groups in the test set. Four ensemble ML models, namely eXtreme Gradient-Boosting (XGBoost), Random Forest (RF), Light Gradient-Boosting Machine (LightGBM) and Adaptive Boosting (AdaBoost), were selected to predict the compressive strength of RP mortar. The comparative results demonstrate that XGBoost has the highest prediction accuracy: its a10-index, MAE, RMSE and R² are 0.926, 1.596, 2.155 and 0.950 on the training set and 0.659, 3.182, 4.285 and 0.842 on the test set, respectively. SHapley Additive exPlanation (SHAP) is adopted to interpret the prediction process of XGBoost and to explain how each factor influences the compressive strength of RP mortar. In order of importance, the influencing factors are the mass replacement rate of RP, the size of RP, the kind of RP and the water-binder ratio of RP. The compressive strength of RP mortar decreases as the RP mass replacement rate increases. The compressive strength of RBP mortar is slightly higher than that of RCP mortar. Machine learning technologies will benefit the construction industry by facilitating the rapid and cost-effective evaluation of RP material properties.
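The four scores quoted in this abstract can be computed as in the minimal sketch below. It assumes the commonly used definition of the a10-index (the fraction of samples whose predicted-to-measured ratio lies between 0.90 and 1.10); the abstract does not spell out its definition, and the function name `regression_metrics` and the sample values are hypothetical, not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return a10-index, MAE, RMSE and R^2 for paired measured/predicted values.

    Assumption: a10-index = share of samples with 0.9 <= predicted/measured <= 1.1.
    Measured values must be non-zero and must not all be equal.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    within_10_percent = sum(
        1 for t, p in zip(y_true, y_pred) if 0.9 <= p / t <= 1.1
    )
    a10 = within_10_percent / n

    return {"a10": a10, "MAE": mae, "RMSE": rmse, "R2": r2}


# Hypothetical compressive strengths in MPa, for illustration only.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [10.5, 19.0, 36.0, 40.0]
print(regression_metrics(measured, predicted))
```

Read this way, the test-set a10-index of 0.659 would mean that roughly two thirds of the 41 test predictions fall within 10% of the measured strength.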
|
2
|
Li Z. A Feature Selection Method Using Dynamic Dependency and Redundancy Analysis. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2022. [DOI: 10.1007/s13369-022-06590-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
3
|
Zhang Y, Ma Y, Yang X. Multi-label feature selection based on logistic regression and manifold learning. APPL INTELL 2022. [DOI: 10.1007/s10489-021-03008-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/02/2022]
|
4
|
Sun J, Yu H, Zhong G, Dong J, Zhang S, Yu H. Random Shapley Forests: Cooperative Game-Based Random Forests With Consistency. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:205-214. [PMID: 32203041 DOI: 10.1109/tcyb.2020.2972956] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 05/26/2023]
Abstract
The original random forests (RFs) algorithm has been widely used and has achieved excellent performance in classification and regression tasks. However, research on the theory of RFs lags far behind its applications. In this article, to narrow the gap between the applications and the theory of RFs, we propose a new RFs algorithm, called random Shapley forests (RSFs), based on the Shapley value. The Shapley value is one of the well-known solution concepts in cooperative game theory and can fairly assess the power of each player in a game. During construction, RSFs use the Shapley value to evaluate the importance of each feature at each tree node by computing the dependency among the possible feature coalitions. In particular, inspired by existing consistency theory, we prove the consistency of the proposed RFs algorithm. Moreover, to verify the effectiveness of the proposed algorithm, experiments were conducted on eight UCI benchmark datasets and four real-world datasets. The results show that RSFs perform better than, or at least comparably to, the existing consistent RFs, the original RFs, and a classic classifier, support vector machines.
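As a rough illustration of the Shapley value mentioned in this abstract, the sketch below computes exact Shapley values for a tiny cooperative game by averaging each player's marginal contribution over all orderings. It is not the RSF node-splitting procedure, whose coalition value function the abstract does not give; `shapley_values` and the toy `glove_value` game are hypothetical names chosen for the example.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by enumerating every ordering of the players.

    `value` maps a frozenset of players to the worth of that coalition.
    Cost grows factorially, so this is only practical for a handful of players.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def glove_value(coalition):
    """Toy game: "a" is only useful together with "b" or "c"."""
    if "a" in coalition and ("b" in coalition or "c" in coalition):
        return 1.0
    return 0.0


phi = shapley_values(["a", "b", "c"], glove_value)
print(phi)  # "a" receives 2/3; "b" and "c" receive 1/6 each
```

If the players are read as features and the coalition worth as some dependency measure, the same averaging yields a per-feature importance score; the values always sum to the worth of the full coalition.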
|
5
|
Huang C, Mi X, Kang B. Basic probability assignment to probability distribution function based on the Shapley value approach. INT J INTELL SYST 2021. [DOI: 10.1002/int.22456] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/10/2022]
Affiliation(s)
- Chongru Huang
  - College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
- Xiangjun Mi
  - College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
- Bingyi Kang
  - College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
  - Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling, Shaanxi, China
  - Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling, Shaanxi, China
|
6
|
Li Y, Cheng Y. Streaming Feature Selection for Multi-Label Data with Dynamic Sliding Windows and Feature Repulsion Loss. ENTROPY 2019. [PMCID: PMC7514496 DOI: 10.3390/e21121151] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
Abstract
In recent years, there has been a growing interest in the problem of multi-label streaming feature selection with no prior knowledge of the feature space. However, the algorithms proposed to handle this problem seldom consider the group structure of streaming features. Another shortcoming arises from the fact that few studies have addressed atomic feature models, and particularly, few have measured the attraction and repulsion between features. To remedy these shortcomings, we develop the streaming feature selection algorithm with dynamic sliding windows and feature repulsion loss (SF-DSW-FRL). This algorithm is essentially carried out in three consecutive steps. Firstly, within dynamic sliding windows, candidate streaming features that are strongly related to the labels in different feature groups are selected and stored in a fixed sliding window. Then, the interaction between features is measured by a loss function inspired by the mutual repulsion and attraction between atoms in physics. Specifically, one feature attraction term and two feature repulsion terms are constructed and combined to create the feature repulsion loss function. Finally, for the fixed sliding window, the best feature subset is selected according to this loss function. The effectiveness of the proposed algorithm is demonstrated through experiments on several multi-label datasets, statistical hypothesis testing, and stability analysis.
Affiliation(s)
- Yu Li
  - School of Computer and Information, Anqing Normal University, Anqing 246003, China
  - Lab of Multimedia and Recommendation Systems, Hefei University of Technology, Hefei 230009, China
- Yusheng Cheng
  - School of Computer and Information, Anqing Normal University, Anqing 246003, China
  - The University Key Laboratory of Intelligent Perception and Computing of Anhui Province, Anqing 246003, China
  - Correspondence:
|
7
|
Dense Model for Automatic Image Description Generation with Game Theoretic Optimization. INFORMATION 2019. [DOI: 10.3390/info10110354] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022]
Abstract
Due to the rapid growth of deep learning technologies, automatic image description generation is an interesting problem in computer vision and natural language generation. It helps to improve access to photo collections on social media and gives guidance for visually impaired people. Currently, deep neural networks play a vital role in computer vision and natural language processing tasks. The main objective of the work is to generate a grammatically correct description of the image using the semantics of the trained captions. An encoder-decoder framework using a deep neural system is used to implement the image description generation task. The encoder is an image parsing module, and the decoder is a surface realization module. The framework uses Densely connected convolutional neural networks (Densenet) for image encoding and Bidirectional Long Short Term Memory (BLSTM) for language modeling, and the outputs are given to a bidirectional LSTM in the caption generator, which is trained to optimize the log-likelihood of the target description of the image. Most of the existing image captioning works use RNN and LSTM for language modeling. RNNs are computationally expensive with limited memory. LSTM checks the inputs in one direction. BLSTM is used in practice, which avoids the problems of RNN and LSTM. In this work, the selection of the best combination of words in caption generation is made using beam search and game theoretic search. The results show that game theoretic search outperforms beam search. The model was evaluated on the standard benchmark dataset Flickr8k. The Bilingual Evaluation Understudy (BLEU) score is taken as the evaluation measure of the system. A new evaluation measure called GCorrect was used to check the grammatical correctness of the description. The proposed model improves on previous methods on the Flickr8k dataset. It produces grammatically correct sentences for images with a GCorrect of 0.040625 and a BLEU score of 69.96%.
|
8
|
Valmarska A, Miljkovic D, Konitsiotis S, Gatsios D, Lavrač N, Robnik-Šikonja M. Symptoms and medications change patterns for Parkinson's disease patients stratification. Artif Intell Med 2018; 91:82-95. [PMID: 29803610 DOI: 10.1016/j.artmed.2018.04.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/31/2017] [Revised: 04/26/2018] [Accepted: 04/30/2018] [Indexed: 12/26/2022]
Abstract
Quality of life of patients with Parkinson's disease degrades significantly with disease progression. This paper presents a step towards personalized management of Parkinson's disease patients, based on discovering groups of similar patients. Similarity is based on patients' medical conditions and changes in the prescribed therapy when the medical conditions change. We present two novel approaches. The first algorithm discovers symptoms' impact on Parkinson's disease progression. Experiments on the Parkinson Progression Markers Initiative (PPMI) data reveal a subset of symptoms influencing disease progression which are already established in Parkinson's disease literature, as well as symptoms that are considered only recently as possible indicators of disease progression by clinicians. The second novelty is a methodology for detecting patterns of medications dosage changes based on the patient status. The methodology combines multitask learning using predictive clustering trees and short time series analysis to better understand when a change in medications is required. The experiments on PPMI data demonstrate that, using the proposed methodology, we can identify some clinically confirmed patients' symptoms suggesting medications change. In terms of predictive performance, our multitask predictive clustering tree approach is mostly comparable to the random forest multitask model, but has the advantage of model interpretability.
Affiliation(s)
- Anita Valmarska
  - Jožef Stefan Institute, Jamova 39, Ljubljana, Slovenia; Jožef Stefan International Postgraduate School, Jamova 39, Ljubljana, Slovenia.
- Spiros Konitsiotis
  - University of Ioannina, Medical School, Department of Neurology, Ioannina, Greece.
- Dimitris Gatsios
  - University of Ioannina, Department of Biomedical Research, Ioannina, Greece.
- Nada Lavrač
  - Jožef Stefan Institute, Jamova 39, Ljubljana, Slovenia; Jožef Stefan International Postgraduate School, Jamova 39, Ljubljana, Slovenia.
|
9
|
Adaptive feature selection using v-shaped binary particle swarm optimization. PLoS One 2017; 12:e0173907. [PMID: 28358850 PMCID: PMC5373580 DOI: 10.1371/journal.pone.0173907] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/20/2017] [Accepted: 02/28/2017] [Indexed: 12/02/2022]
Abstract
Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers.
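The position update that distinguishes V-shaped binary particle swarm optimization can be sketched as follows. The abstract does not say which V-shaped transfer function is used, so this sketch assumes the common choice |tanh(v)| together with the usual rule for V-shaped functions: a bit is flipped with probability T(v) and kept otherwise. All function names are hypothetical.

```python
import math
import random


def v_shaped_transfer(velocity):
    """Assumed V-shaped transfer function: T(v) = |tanh(v)|, with values in [0, 1]."""
    return abs(math.tanh(velocity))


def update_bit(bit, velocity, rng):
    """Flip the bit with probability T(v); otherwise keep it unchanged."""
    if rng.random() < v_shaped_transfer(velocity):
        return 1 - bit
    return bit


def update_position(position, velocities, rng):
    """Apply the flip rule to every feature-selection bit of one particle."""
    return [update_bit(b, v, rng) for b, v in zip(position, velocities)]


# A particle is a 0/1 mask over features; 1 means the feature is selected.
rng = random.Random(0)
mask = [1, 0, 0, 1, 0]
velocities = [0.0, 2.5, -2.5, 0.1, -0.1]
print(update_position(mask, velocities, rng))
```

Because T(v) depends only on the magnitude of the velocity, a near-zero velocity leaves the current bit in place, whereas a large velocity of either sign makes a flip likely. The number of selected features is therefore not fixed in advance, which is consistent with the adaptive behaviour the abstract describes.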
|
10
|
Barchinezhad S, Eftekhari M. Unsupervised feature selection method based on sensitivity and correlation concepts for multiclass problems. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2016. [DOI: 10.3233/ifs-151736] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Affiliation(s)
- Soheila Barchinezhad
  - Department of Electronic and Computer, Kerman Graduate University of Advanced Technology, Haft Bagh Blvd, Mahan, Kerman, Iran
- Mahdi Eftekhari
  - Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
|
11
|
Robust Feature Selection from Microarray Data Based on Cooperative Game Theory and Qualitative Mutual Information. Adv Bioinformatics 2016; 2016:1058305. [PMID: 27127506 PMCID: PMC4818815 DOI: 10.1155/2016/1058305] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/28/2015] [Revised: 02/20/2016] [Accepted: 02/22/2016] [Indexed: 11/17/2022]
Abstract
The high dimensionality of microarray data sets may lead to low efficiency and overfitting. In this paper, a multiphase cooperative game theoretic feature selection approach is proposed for microarray data classification. In the first phase, because of the high dimensionality of microarray data sets, the features are reduced using one of two filter-based feature selection methods, namely, mutual information and Fisher ratio. In the second phase, the Shapley index is used to evaluate the power of each feature. The main innovation of the proposed approach is to employ Qualitative Mutual Information (QMI) for this purpose. Using Qualitative Mutual Information makes the selected features more stable, and this stability helps to deal with the problem of data imbalance and scarcity. In the third phase, a forward selection scheme is applied which uses a scoring function to weight each feature. The performance of the proposed method is compared with other popular feature selection algorithms such as Fisher ratio, minimum redundancy maximum relevance, and previous works on cooperative game based feature selection. The average classification accuracy on eleven microarray data sets shows that the proposed method improves both average accuracy and average stability compared to other approaches.
|
12
|
|
13
|
A cooperative expert based support vector regression (Co-ESVR) system to determine collar dimensions around bridge pier. Neurocomputing 2014. [DOI: 10.1016/j.neucom.2014.03.024] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 11/23/2022]
|
14
|
Zeng K, She K, Niu X. Feature selection with neighborhood entropy-based cooperative game theory. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2014; 2014:479289. [PMID: 25276120 PMCID: PMC4158261 DOI: 10.1155/2014/479289] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/15/2014] [Revised: 07/27/2014] [Accepted: 08/10/2014] [Indexed: 11/18/2022]
Abstract
Feature selection plays an important role in machine learning and data mining. In recent years, various feature measurements have been proposed to select significant features from high-dimensional datasets. However, most traditional feature selection methods ignore features that have strong classification ability as a group but are weak as individuals. To deal with this problem, we redefine the redundancy, interdependence, and independence of features by using neighborhood entropy. Then the neighborhood entropy-based feature contribution is proposed under the framework of cooperative game theory. The evaluative criteria of features can be formalized as the product of contribution and other classical feature measures. Finally, the proposed method is tested on several UCI datasets. The results show that the neighborhood entropy-based cooperative game theory model (NECGT) yields better performance than classical ones.
Affiliation(s)
- Kai Zeng
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Kun She
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xinzheng Niu
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
|
15
|
Sun X, Liu Y, Xu M, Chen H, Han J, Wang K. Feature selection using dynamic weights for classification. Knowl Based Syst 2013. [DOI: 10.1016/j.knosys.2012.10.001] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 10/27/2022]
|