1
|
Adaptively Promoting Diversity in a Novel Ensemble Method for Imbalanced Credit-Risk Evaluation. MATHEMATICS 2022. [DOI: 10.3390/math10111790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Ensemble learning techniques are widely applied to classification tasks such as credit-risk evaluation. In most real-world credit-risk evaluation scenarios, only imbalanced data are available for model construction, and the performance of ensemble models still needs improvement. An ideal ensemble algorithm should promote diversity effectively. We therefore investigate an ensemble diversity-promotion method for imbalanced learning tasks. We propose a novel ensemble structure, SA-DP Forest, which combines self-adaptive optimization techniques with a diversity-promotion method. Additional artificially constructed samples, generated by a fuzzy sampling method at each iteration, directly create diverse hypotheses and address the class imbalance while the proposed model is trained. Meanwhile, the self-adaptive optimization mechanism within the ensemble maintains individual accuracy as diversity increases. Results with a decision tree as the base classifier indicate that SA-DP Forest outperforms the comparative algorithms on most evaluation metrics across three credit data sets and seven other imbalanced data sets. The method is also better suited to experimental data constructed by imposing a series of artificial imbalance ratios on the original credit data set.
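To make the notion of ensemble diversity concrete, the following is a minimal sketch of one common diversity measure, the mean pairwise disagreement between base classifiers, together with a majority vote. It is not the SA-DP Forest algorithm; the function names and the toy predictions are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def disagreement(pred_a, pred_b):
    """Fraction of samples on which two classifiers predict different labels."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)


def ensemble_diversity(predictions):
    """Mean pairwise disagreement over all pairs of base classifiers."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


def majority_vote(predictions):
    """Combine base-classifier predictions sample by sample."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


# Hypothetical predictions of three base trees on four samples
# (1 = default, 0 = non-default).
preds = [
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
print(ensemble_diversity(preds))
print(majority_vote(preds))
```

A higher disagreement value means the base classifiers err on different samples, which is what a diversity-promotion step aims for, provided individual accuracy does not collapse.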
|
2
|
Abstract
SAR image registration is a crucial problem in SAR image processing, since high-precision registration results improve the quality of downstream tasks such as change detection in SAR images. Most recent DL-based SAR image registration methods treat registration as a binary classification problem with matching and non-matching categories. To generate the training set, they generally capture pairs of image blocks around key points at a single fixed scale, although image blocks at different scales contain different information, which affects registration performance. Moreover, the number of key points is too small to generate a large set of class-balanced training samples. Hence, we propose a new SAR image registration method that uses information from multiple scales to construct the matching models. Specifically, because the number of training samples is small, deep forest is employed to train multiple matching models. A multi-scale fusion strategy is then proposed to integrate the multiple predictions and obtain the best pairs of matching points between the reference image and the sensed image. Finally, experimental results on four datasets show that the proposed method outperforms the compared state-of-the-art methods, and the analyses of different scales indicate that fusing multiple scales is more effective and more robust for SAR image registration than using a single fixed scale.
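The idea of integrating predictions from several scale-specific matching models can be sketched as follows. The abstract does not specify the fusion rule, so simple averaging of matching probabilities is used here as an assumed stand-in; the scale values, candidate names, and function names are all hypothetical.

```python
def fuse_scores(scale_scores):
    """Average the matching probability of each candidate pair across scales.

    scale_scores maps a patch scale to a dict of {candidate_pair: probability}.
    Averaging is an assumed fusion rule, not the one used in the paper.
    """
    collected = {}
    for scores in scale_scores.values():
        for pair, prob in scores.items():
            collected.setdefault(pair, []).append(prob)
    return {pair: sum(probs) / len(probs) for pair, probs in collected.items()}


def best_match(scale_scores):
    """Return the candidate pair with the highest fused probability."""
    fused = fuse_scores(scale_scores)
    return max(fused, key=fused.get)


# Hypothetical matching probabilities for two candidate pairs at three scales.
scores_by_scale = {
    16: {"pair_a": 0.9, "pair_b": 0.6},
    32: {"pair_a": 0.4, "pair_b": 0.7},
    64: {"pair_a": 0.5, "pair_b": 0.8},
}
print(best_match(scores_by_scale))
```

In this toy input, the smallest scale alone would favor `pair_a`, while the fused score favors `pair_b`, which illustrates why a single fixed scale can be less robust.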
|
3
|
Wu Y, Hu H, Cai J, Chen R, Zuo X, Cheng H, Yan D. Machine Learning for Predicting the 3-Year Risk of Incident Diabetes in Chinese Adults. Front Public Health 2021; 9:626331. [PMID: 34268283 PMCID: PMC8275929 DOI: 10.3389/fpubh.2021.626331] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/16/2020] [Accepted: 05/21/2021] [Indexed: 02/05/2023] Open
Abstract
Purpose: We aimed to establish and validate a risk assessment system that combines demographic and clinical variables to predict the 3-year risk of incident diabetes in Chinese adults. Methods: A 3-year cohort study was performed on 15,928 Chinese adults without diabetes at baseline. All participants were randomly divided into a training set (n = 7,940) and a validation set (n = 7,988). XGBoost, an effective machine learning technique, was used to select the most important variables from the candidate variables. We then established a stepwise model based on the predictors chosen by the XGBoost model. The area under the receiver operating characteristic curve (AUC), decision curve analysis, and calibration analysis were used to assess the discrimination, clinical use, and calibration of the model, respectively. External validation was performed on a cohort of 11,113 Japanese participants. Results: In the training and validation sets, 148 and 145 incident diabetes cases occurred, respectively. XGBoost selected the 10 most important variables from 15 candidate variables. Fasting plasma glucose (FPG), body mass index (BMI), and age were the top 3 variables. We then established a stepwise model and a prediction nomogram. The AUCs of the stepwise model were 0.933 and 0.910 in the training and validation sets, respectively. The Hosmer-Lemeshow test showed a good fit between the predicted and observed diabetes risk (p = 0.068 for the training set, p = 0.165 for the validation set). Decision curve analysis supported the clinical use of the stepwise model across a wide range of threshold probabilities. There were almost no interactions between these predictors (most P-values for interaction >0.05). Furthermore, the AUC for the external validation set was 0.830, and the Hosmer-Lemeshow test for the external validation set showed no statistically significant difference between the predicted and observed diabetes risk (P = 0.824). Conclusion: We established and validated a risk assessment system for characterizing the 3-year risk of incident diabetes.
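For readers less familiar with the discrimination metric reported above, the following is a minimal sketch of how an AUC can be computed from predicted risks and observed outcomes, using the pairwise (Mann-Whitney) definition. The function name and the toy data are assumptions for illustration; they are not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case scores above a random non-case.

    labels: 1 for incident diabetes, 0 otherwise.
    scores: predicted risks; ties between a case and a non-case count as 0.5.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical outcomes and predicted risks for four participants.
y_true = [0, 0, 1, 1]
y_risk = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_risk))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases. The quadratic pairwise loop is chosen for clarity; a rank-based formulation would be preferable for cohorts of the size described here.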
Affiliation(s)
- Yang Wu
  - Department of Endocrinology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Endocrinology, Shenzhen Second People's Hospital, Shenzhen, China
  - Shenzhen University Health Science Center, Shenzhen, China
- Haofei Hu
  - Shenzhen University Health Science Center, Shenzhen, China
  - Department of Nephrology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Nephrology, Shenzhen Second People's Hospital, Shenzhen, China
- Jinlin Cai
  - Department of Endocrinology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Endocrinology, Shenzhen Second People's Hospital, Shenzhen, China
  - Shantou University Medical College, Shantou, China
- Runtian Chen
  - Department of Endocrinology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Endocrinology, Shenzhen Second People's Hospital, Shenzhen, China
  - Shenzhen University Health Science Center, Shenzhen, China
- Xin Zuo
  - Department of Endocrinology, The Third People's Hospital of Shenzhen, Shenzhen, China
- Heng Cheng
  - Department of Endocrinology, The Third People's Hospital of Shenzhen, Shenzhen, China
- Dewen Yan
  - Department of Endocrinology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Endocrinology, Shenzhen Second People's Hospital, Shenzhen, China
  - Shenzhen University Health Science Center, Shenzhen, China
  - *Correspondence: Dewen Yan
|