1
|
Lu B, Meng X, Dong S, Zhang Z, Liu C, Jiang J, Herrmann H, Li X. High-resolution mapping of regional VOCs using the enhanced space-time extreme gradient boosting machine (XGBoost) in Shanghai. The Science of the Total Environment 2023; 905:167054. [PMID: 37714357 DOI: 10.1016/j.scitotenv.2023.167054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 09/10/2023] [Accepted: 09/11/2023] [Indexed: 09/17/2023]
Abstract
The accurate estimation of volatile organic compounds (VOCs) at high spatiotemporal resolution is of great significance for establishing advanced early warning systems and regulating air pollution control. However, the estimation of VOCs at high spatiotemporal resolution remains incomplete. Here, the space-time extreme gradient boosting model (STXGB) was enhanced by integrating spatiotemporal information to improve the spatial resolution and overall accuracy of VOCs estimates. To this end, meteorological, topographical and pollutant emission data were input to the STXGB model, and regional hourly 300 m VOCs maps for 2020 in Shanghai were produced. Our results show that the STXGB model achieves good hourly VOCs estimation performance (R2 = 0.73). A further analysis using SHapley Additive exPlanation (SHAP) regression indicates that local interpretations of the STXGB model demonstrate the strong contribution of emissions to the mapped VOCs estimates, while acknowledging the important contribution of the space and time terms. The proposed approach outperforms many traditional machine learning models with a lower computational burden in terms of speed and memory.
Affiliation(s)
- Bingqing Lu
  - Department of Environmental Science & Engineering, Fudan University, Shanghai 200438, PR China
- Xue Meng
  - Department of Environmental Science & Engineering, Fudan University, Shanghai 200438, PR China
- Shanshan Dong
  - Department of Environmental Science & Engineering, Fudan University, Shanghai 200438, PR China
- Zekun Zhang
  - Department of Environmental Science & Engineering, Fudan University, Shanghai 200438, PR China
- Chao Liu
  - Department of Environmental Science & Engineering, Fudan University, Shanghai 200438, PR China
- Jiakui Jiang
  - Department of Environmental Science & Engineering, Fudan University, Shanghai 200438, PR China
- Hartmut Herrmann
  - Leibniz-Institut für Troposphärenforschung (IfT), Permoserstr. 15, 04318 Leipzig, Germany
- Xiang Li
  - Department of Environmental Science & Engineering, Fudan University, Shanghai 200438, PR China; Institute of Eco-Chongming (IEC), Shanghai 200241, China.
|
2
|
Wang M, Kang J, Liu W, Su J, Li M. Research on prediction of compressive strength of fly ash and slag mixed concrete based on machine learning. PLoS One 2022; 17:e0279293. [PMID: 36574382 PMCID: PMC9794082 DOI: 10.1371/journal.pone.0279293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 12/04/2022] [Indexed: 12/28/2022] Open
Abstract
Every year, a large amount of solid waste such as fly ash and slag is generated worldwide. Using these solid wastes in concrete mixes can effectively save resources and protect the environment. The compressive strength of concrete is an essential indicator of its quality, but it is affected by many factors and is therefore difficult to predict accurately. Three popular supervised machine learning algorithms, Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Support Vector Regression (SVR), were therefore used to establish a nonlinear mapping between multi-factor features and the target feature, concrete compressive strength. The three trained models were validated on a test set of 206 samples, with the Root Mean Square Error (RMSE), fitting coefficient (R2), and Mean Absolute Error (MAE) as evaluation metrics. The validation results showed that the values of RMSE, R2, and MAE for the RF model were 0.1, 0.9, and 0.21, respectively; the values for the XGBoost model were 0.05, 0.95, and 0.15, respectively; and the values for the SVR model were 0.15, 0.86, and 0.3, respectively. As a result, XGBoost has better generalization ability and prediction accuracy than the other two algorithms.
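The three evaluation metrics named in this abstract can be computed as in the minimal sketch below. It is an illustration only: the function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper, and the authors' own implementation is not described in the abstract.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, r2, mae) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # RMSE: square root of the mean squared error.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # MAE: mean of the absolute errors.
    mae = sum(abs(e) for e in errors) / n

    # R2: 1 minus residual sum of squares over total sum of squares.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return rmse, r2, mae


# Hypothetical values for illustration; not data from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
rmse, r2, mae = regression_metrics(y_true, y_pred)
print(rmse, r2, mae)
```

For any set of predictions MAE cannot exceed RMSE, which is a quick sanity check when reading reported metric triples.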
Affiliation(s)
- Meng Wang
  - College of Mining Engineering, Liaoning Technical University, Fuxin, China
- Jiaxu Kang
  - College of Mining Engineering, Liaoning Technical University, Fuxin, China
  - * E-mail:
- Weiwei Liu
  - College of Mining Engineering, Liaoning Technical University, Fuxin, China
- Jinshuai Su
  - College of Mining Engineering, Liaoning Technical University, Fuxin, China
- Meng Li
  - College of Mining Engineering, Liaoning Technical University, Fuxin, China
|
3
|
Short- and Medium-Term Power Demand Forecasting with Multiple Factors Based on Multi-Model Fusion. Mathematics 2022. [DOI: 10.3390/math10122148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
With the continuous development of the economy and society, power demand forecasting has become an important task for the power industry. Accurate power demand forecasting can promote the operation and development of the power supply industry. However, since power consumption is affected by a number of factors, it is difficult to predict power demand accurately. With the accumulation of data in the power industry, machine learning technology has shown great potential in power demand forecasting. In this study, gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost) and light gradient boosting machine (LightGBM) are integrated by stacking to build an XLG-LR fusion model to predict power demand. First, preprocessing was carried out on 13 months of electricity and meteorological data. Second, the hyperparameters of each model were adjusted and optimized. Third, based on the optimal hyperparameter configuration, a prediction model was built using the training set (70% of the data). Finally, the test set (30% of the data) was used to evaluate the performance of each model. Mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and goodness-of-fit coefficient (R^2) were used to analyze each model at different lengths of time, including their seasonal, weekly, and monthly forecast performance. Furthermore, the proposed fusion model was compared with neural network models such as the GRU, LSTM and TCN models. The results showed that the XLG-LR model achieved the best prediction results at different time lengths while consuming less time than the neural network models. This method can provide a more reliable reference for the operation and dispatch of power enterprises and for future power construction and planning.
|
4
|
Modelling Soil Temperature by Tree-Based Machine Learning Methods in Different Climatic Regions of China. Applied Sciences-Basel 2022. [DOI: 10.3390/app12105088] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Accurate estimation of soil temperature (Ts) at a national scale under different climatic conditions is important for soil–plant–atmosphere interactions. This study estimated daily Ts at the 0 cm depth for 689 meteorological stations in seven different climate zones of China for the period 1966–2015 with the M5P model tree (M5P), random forests (RF), and the extreme gradient boosting (XGBoost). The results showed that the XGBoost model (averaged coefficient of determination (R2) = 0.964 and root mean square error (RMSE) = 2.066 °C) overall performed better than the RF (averaged R2 = 0.959 and RMSE = 2.130 °C) and M5P (averaged R2 = 0.954 and RMSE = 2.280 °C) models for estimating Ts with higher computational efficiency. With the combination of mean air temperature (Tmean) and global solar radiation (Rs) as inputs, the estimating accuracy of the models was considerably high (averaged R2 = 0.96–0.97 and RMSE = 1.73–1.99 °C). On the basis of Tmean, adding Rs to the model input had a greater degree of influence on model estimating accuracy than adding other climatic factors to the input. Principal component analysis indicated that soil organic matter, soil water content, Tmean, relative humidity (RH), Rs, and wind speed (U2) are the main factors that cause errors in estimating Ts, and the total error interpretation rate was 97.9%. Overall, XGBoost would be a suitable algorithm for estimating Ts in different climate zones of China, and the combination of Tmean and Rs as model inputs would be more practical than other input combinations.
|
5
|
An Improved Sea Ice Classification Algorithm with Gaofen-3 Dual-Polarization SAR Data Based on Deep Convolutional Neural Networks. Remote Sensing 2022. [DOI: 10.3390/rs14040906] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/01/2023]
Abstract
The distribution of sea ice is one of the major safety hazards for sea navigation. As human activities in polar regions become more frequent, monitoring and forecasting of sea ice are of great significance. In this paper, we use data from the C-band synthetic aperture radar (SAR) on the Gaofen-3 satellite, acquired in the dual-polarization (VV, VH) fine strip II (FSII) mode of operation, to study Arctic sea ice classification in winter. The SAR data were acquired over the western Arctic Ocean from January to February 2020. We classify the sea ice into four categories, namely new ice (NI), thin first-year ice (tI), thick first-year ice (TI), and old ice (OI), by referring to the ice maps provided by the Canadian Ice Service (CIS). Then, we use the deep learning model MobileNetV3 as the backbone network, input samples of different sizes, and combine the backbone network with multiscale feature fusion methods to build a deep learning model called Multiscale MobileNet (MSMN). Dual-polarization SAR data are used to synthesize pseudocolor images and produce samples of sizes 16 × 16 × 3, 32 × 32 × 3, and 64 × 64 × 3 as input. Ultimately, MSMN can reach over 95% classification accuracy on testing SAR sea ice images. The classification results using only VV polarization or VH polarization data are tested, and it is found that using dual-polarization data could improve the classification accuracy by 10.05% and 9.35%, respectively. When other classification models are trained using the training data from this paper for comparison, the accuracy of MSMN is 4.86% and 1.84% higher on average than that of the model built using convolutional neural networks (CNNs) and the ResNet18 model, respectively.
|
6
|
Synergetic Classification of Coastal Wetlands over the Yellow River Delta with GF-3 Full-Polarization SAR and Zhuhai-1 OHS Hyperspectral Remote Sensing. Remote Sensing 2021. [DOI: 10.3390/rs13214444] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/17/2023]
Abstract
The spatial distribution of coastal wetlands affects their ecological functions. Wetland classification is a challenging task for remote sensing research due to the similarity of different wetlands. In this study, a synergetic classification method developed by fusing the 10 m Zhuhai-1 Constellation Orbita Hyperspectral Satellite (OHS) imagery with 8 m C-band Gaofen-3 (GF-3) full-polarization Synthetic Aperture Radar (SAR) imagery was proposed to offer an updated and reliable quantitative description of the spatial distribution for the entire Yellow River Delta coastal wetlands. Three classical machine learning algorithms, namely, the maximum likelihood (ML), Mahalanobis distance (MD), and support vector machine (SVM), were used for the synergetic classification of 18 spectral, index, polarization, and texture features. The results showed that the overall synergetic classification accuracy of 97% is significantly higher than that of single GF-3 or OHS classification, proving the performance of the fusion of full-polarization SAR data and hyperspectral data in wetland mapping. The synergy of polarimetric SAR (PolSAR) and hyperspectral imagery enables high-resolution classification of wetlands by capturing images throughout the year, regardless of cloud cover. The proposed method has the potential to provide wetland classification results with high accuracy and better temporal resolution in different regions. Detailed and reliable wetland classification results would provide important wetlands information for better understanding the habitat area of species, migration corridors, and the habitat change caused by natural and anthropogenic disturbances.
|
7
|
Deng S, Huang X, Qin Z, Fu Z, Yang T. A novel hybrid method for direction forecasting and trading of Apple Futures. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107734] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 10/20/2022]
|
8
|
Zhu L, Ma X, Wu P, Xu J. Multiple Classifiers Based Semi-Supervised Polarimetric SAR Image Classification Method. Sensors 2021; 21:s21093006. [PMID: 33922957 PMCID: PMC8123318 DOI: 10.3390/s21093006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2021] [Revised: 04/14/2021] [Accepted: 04/16/2021] [Indexed: 11/25/2022]
Abstract
Polarimetric synthetic aperture radar (PolSAR) image classification has played an important role in PolSAR data application. Deep learning has achieved great success in PolSAR image classification over the past years. However, when the labeled training dataset is insufficient, the classification results are usually unsatisfactory. Furthermore, the deep learning approach is based on hierarchical features and cannot take full advantage of the scattering characteristics in PolSAR data. Hence, it is worthwhile to make full use of scattering characteristics to obtain a high classification accuracy based on limited labeled samples. In this paper, we propose a novel semi-supervised classification method for PolSAR images, which combines the deep learning technique with the traditional scattering trait-based classifiers. Firstly, based on only a small number of training samples, the classification results of the Wishart classifier, support vector machine (SVM) classifier, and a complex-valued convolutional neural network (CV-CNN) are used to conduct majority voting, thus generating a strong dataset and a weak dataset. The strong dataset is then used as pseudo-labels to reclassify the weak dataset by CV-CNN. The final classification results are obtained by combining the strong dataset and the reclassification results. Experiments on two real PolSAR images of agricultural and forest areas indicate that, in most cases, significant improvements can be achieved with the proposed method compared to the base classifiers, and the improvement is approximately 3–5%. When the number of labeled samples is small, the superiority of the proposed method is even more apparent. The improvement for built-up areas or infrastructure objects is not as significant as that for forests.
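The voting step described in this abstract can be sketched as follows. The abstract does not say exactly how the strong and weak datasets are separated, so this sketch assumes that a pixel is "strong" when all three classifiers agree and "weak" otherwise; that rule, the function names, and the class labels are hypothetical illustrations, not the authors' method.

```python
from collections import Counter


def majority_label(votes):
    """Return the label with a strict majority, or None if there is none."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


def split_strong_weak(votes_per_pixel):
    """Split pixels by classifier agreement.

    votes_per_pixel: one tuple of labels per pixel, one label per classifier.
    Returns (strong, weak): strong maps pixel index -> unanimous label,
    weak lists the indices left for reclassification.
    ASSUMPTION: "strong" means unanimous agreement.
    """
    strong = {}
    weak = []
    for idx, votes in enumerate(votes_per_pixel):
        if len(set(votes)) == 1:
            strong[idx] = votes[0]
        else:
            weak.append(idx)
    return strong, weak


# Hypothetical outputs of three classifiers (e.g. Wishart, SVM, CV-CNN).
votes = [
    ("crop", "crop", "crop"),     # unanimous
    ("crop", "forest", "crop"),   # majority only
    ("water", "forest", "crop"),  # no agreement
]
strong, weak = split_strong_weak(votes)
print(strong, weak)
```

In a pipeline like the one described, the strong pixels would then serve as pseudo-labels for retraining, and the weak pixels would be reclassified by the retrained network.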
Affiliation(s)
- Lekun Zhu
  - School of Resources and Environmental Engineering/Anhui Province Key Laboratory of Wetland Ecosystem Protection and Restoration, Anhui University, Hefei 230601, China
- Xiaoshuang Ma
  - School of Resources and Environmental Engineering/Anhui Province Key Laboratory of Wetland Ecosystem Protection and Restoration, Anhui University, Hefei 230601, China
  - Department of Resource and Environmental Sciences, Wuhan University, Wuhan 430072, China
  - Correspondence:
- Penghai Wu
  - School of Resources and Environmental Engineering/Anhui Province Key Laboratory of Wetland Ecosystem Protection and Restoration, Anhui University, Hefei 230601, China
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei 230601, China
- Jiangong Xu
  - School of Resources and Environmental Engineering/Anhui Province Key Laboratory of Wetland Ecosystem Protection and Restoration, Anhui University, Hefei 230601, China
|
9
|
Mapping Smallholder Maize Farms Using Multi-Temporal Sentinel-1 Data in Support of the Sustainable Development Goals. Remote Sensing 2021. [DOI: 10.3390/rs13091666] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
Abstract
Reducing food insecurity in developing countries is one of the crucial targets of the Sustainable Development Goals (SDGs). Smallholder farmers play a crucial role in combating food insecurity. However, local planning agencies and governments do not have adequate spatial information on smallholder farmers, and this affects the monitoring of the SDGs. This study utilized Sentinel-1 multi-temporal data to develop a framework for mapping smallholder maize farms and to estimate maize production area as a parameter for supporting the SDGs. We used Principal Component Analysis (PCA) to pixel fuse the multi-temporal data to only three components for each polarization (vertical transmit and vertical receive (VV), vertical transmit and horizontal receive (VH), and VV/VH), which explained more than 70% of the information. The Support Vector Machine (SVM) and Extreme Gradient Boosting (Xgboost) algorithms were used at model-level feature fusion to classify the data. The results show that the adopted strategy of two-stage image fusion was sufficient to map the distribution and estimate production areas for smallholder farms. An overall accuracy of more than 90% for both SVM and Xgboost algorithms was achieved. There was a 3% difference in production area estimation observed between the two algorithms. This framework can be used to generate spatial agricultural information in areas where agricultural survey data are limited and for areas that are affected by cloud coverage. We recommend the use of Sentinel-1 multi-temporal data in conjunction with machine learning algorithms to map smallholder maize farms to support the SDGs.
|
10
|
Combination of Feature Selection and CatBoost for Prediction: The First Application to the Estimation of Aboveground Biomass. Forests 2021. [DOI: 10.3390/f12020216] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/13/2022]
Abstract
Increasing numbers of explanatory variables tend to result in information redundancy and “dimensional disaster” in the quantitative remote sensing of forest aboveground biomass (AGB). Feature selection of model factors is an effective method for improving the accuracy of AGB estimates. Machine learning algorithms are also widely used in AGB estimation, although little research has addressed the use of the categorical boosting algorithm (CatBoost) for AGB estimation. Both feature selection and regression for AGB estimation models are typically performed with the same machine learning algorithm, but there is no evidence to suggest that this is the best method. Therefore, the present study focuses on evaluating the performance of the CatBoost algorithm for AGB estimation and comparing the performance of different combinations of feature selection methods and machine learning algorithms. AGB estimation models of four forest types were developed based on Landsat OLI data using three feature selection methods (recursive feature elimination (RFE), variable selection using random forests (VSURF), and least absolute shrinkage and selection operator (LASSO)) and three machine learning algorithms (random forest regression (RFR), extreme gradient boosting (XGBoost), and categorical boosting (CatBoost)). Feature selection had a significant influence on AGB estimation. RFE preserved the most informative features for AGB estimation and was superior to VSURF and LASSO. In addition, CatBoost improved the accuracy of the AGB estimation models compared with RFR and XGBoost. AGB estimation models using RFE for feature selection and CatBoost as the regression algorithm achieved the highest accuracy, with root mean square errors (RMSEs) of 26.54 Mg/ha for coniferous forest, 24.67 Mg/ha for broad-leaved forest, 22.62 Mg/ha for mixed forests, and 25.77 Mg/ha for all forests. 
The combination of RFE and CatBoost performed better than the VSURF–RFR combination, in which random forests were used for both feature selection and regression, indicating that feature selection and regression performed by a single machine learning algorithm may not always ensure optimal AGB estimation. Extending the application of new machine learning algorithms and feature selection methods is a promising way to improve the accuracy of AGB estimates.
|
11
|
Learning Rotation Domain Deep Mutual Information Using Convolutional LSTM for Unsupervised PolSAR Image Classification. Remote Sensing 2020. [DOI: 10.3390/rs12244075] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
Deep learning can achieve state-of-the-art performance in polarimetric synthetic aperture radar (PolSAR) image classification with plenty of labeled data. However, obtaining a large number of accurately labeled samples of PolSAR data is very hard, which limits the practical use of deep learning. Therefore, unsupervised PolSAR image classification based on deep learning is worthy of further investigation. Inspired by the superior performance of deep mutual information in natural image feature learning and clustering, an end-to-end Convolutional Long Short Term Memory (ConvLSTM) network is used to learn the deep mutual information of polarimetric coherent matrices in the rotation domain with different polarimetric orientation angles (POAs) for unsupervised PolSAR image classification. First, for each pixel, paired “POA-spatio” samples are generated from the polarimetric coherent matrices with different POAs. Second, a specially designed ConvLSTM network, along with deep mutual information losses, is used to learn the discriminative deep mutual information feature representation of the paired data. Finally, the classification results can be output directly from the trained network model. The proposed method is trained in an end-to-end manner and does not have cumbersome pipelines. Experiments on four real PolSAR datasets show that the performance of the proposed method surpasses that of some state-of-the-art deep learning unsupervised classification methods.
|
12
|
DBF Processing in Range-Doppler Domain for MWE SAR Waveform Separation Based on Digital Array-Fed Reflector Antenna. Remote Sensing 2020. [DOI: 10.3390/rs12193161] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
With the rapid development of multiple-input multiple-output synthetic aperture radar (MIMO SAR) systems, the demands for antenna miniaturization and high gain are increasing. The digital array-fed reflector antenna has these virtues and can therefore play an important role in such systems. However, the geometric models and signal models based on a reflector antenna are considerably different from those of the directly radiating planar antenna. The signal processing for the reflector antenna is more complex and difficult. As a result, the applications of the reflector antenna in SAR systems are not as mature as those of the planar antenna. A combination of the multidimensional waveform encoding (MWE) technique and digital beamforming (DBF) technology at the receiving end can greatly improve MIMO SAR system performance, especially ambiguity suppression and waveform separation. This configuration can realize different radar functions and meet multidimensional observation requirements, such as polarized SAR. Thus, this study combines the digital array-fed reflector antenna and the DBF technique in the elevation direction for MWE SAR waveform separation. The echo models for the array-fed reflector antenna and the planar antenna are established based on short-time shift-orthogonal waveforms. In the models, a mismatch in steering vectors is inevitable if DBF processing is performed in the traditional way, in the azimuth-elevation two-dimensional time domain. This mismatch worsens the waveform separation effect and the image quality. Therefore, we propose a DBF method that is processed in the range-Doppler domain. The method enables waveform separation without ambiguity at the receiver. Conventional SAR imaging methods can then be applied, and we acquire an ideal SAR image. The simulation results for both point targets and distributed targets demonstrate the effectiveness and feasibility of the proposed DBF method.
|
13
|
Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model. Healthcare (Basel) 2020; 8:healthcare8030247. [PMID: 32751894 PMCID: PMC7551910 DOI: 10.3390/healthcare8030247] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/14/2020] [Revised: 07/27/2020] [Accepted: 07/29/2020] [Indexed: 11/17/2022] Open
Abstract
In view of the harm of diabetes to the population, we have introduced an ensemble learning algorithm, EXtreme Gradient Boosting (XGBoost), to predict the risk of type 2 diabetes and compared it with the Support Vector Machine (SVM), Random Forest (RF) and K-Nearest Neighbor (K-NN) algorithms in order to improve the prediction performance of existing models. A combination of convenience sampling and snowball sampling in Xicheng District, Beijing was used to conduct a questionnaire survey on the personal data, eating habits, exercise status and family medical history of 380 middle-aged and elderly people. Then, we trained the models and obtained the disease risk index for each sample with 10-fold cross-validation. Experiments were conducted to compare the commonly used machine learning algorithms mentioned above, and we found that XGBoost had the best prediction performance, with an average accuracy of 0.8909 and an area under the receiver operating characteristic curve (AUC) of 0.9182. Therefore, due to the superiority of its architecture, XGBoost has better prediction accuracy and generalization ability than existing algorithms in predicting the risk of type 2 diabetes, which is conducive to the intelligent prevention and control of diabetes in the future.
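The two figures reported in this abstract, accuracy and AUC, can be computed as in the sketch below. It uses the pairwise definition of AUC (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half). The function names, labels and scores are hypothetical; the abstract does not describe the authors' own evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    y_true holds 0/1 labels; scores holds the predicted risk per sample.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and risk scores; not data from the study.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption
acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(acc, auc)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a survey of a few hundred respondents; a rank-based formulation would scale better.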
|
14
|
Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Applied Sciences 2020. [DOI: 10.1007/s42452-020-3060-1] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Indexed: 02/07/2023] Open
|
15
|
Meta-XGBoost for Hyperspectral Image Classification Using Extended MSER-Guided Morphological Profiles. Remote Sensing 2020. [DOI: 10.3390/rs12121973] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 11/16/2022]
Abstract
To investigate the performance of extreme gradient boosting (XGBoost) in remote sensing image classification tasks, XGBoost was first introduced and comparatively investigated for the spectral-spatial classification of hyperspectral imagery using the extended maximally stable extreme-region-guided morphological profiles (EMSER_MPs) proposed in this study. To overcome the potential issues of XGBoost, meta-XGBoost was proposed as an ensemble XGBoost method with classification and regression tree (CART), dropout-introduced multiple additive regression tree (DART), elastic net regression and parallel coordinate descent-based linear regression (linear) and random forest (RaF) boosters. Moreover, to evaluate the performance of the introduced XGBoost approach with different boosters, meta-XGBoost and EMSER_MPs, well-known and widely accepted classifiers, including support vector machine (SVM), bagging, adaptive boosting (AdaBoost), multi class AdaBoost (MultiBoost), extremely randomized decision trees (ExtraTrees), RaF, classification via random forest regression (CVRFR) and ensemble of nested dichotomies with extremely randomized decision tree (END-ERDT) methods, were considered in terms of the classification accuracy and computational efficiency. The experimental results based on two benchmark hyperspectral data sets confirm the superior performance of EMSER_MPs and EMSER_MPs with mean pixel values within region (EMSER_MPsM) compared to that for morphological profiles (MPs), morphological profile with partial reconstruction (MPPR), extended MPs (EMPs), extended MPPR (EMPPR), maximally stable extreme-region-guided morphological profiles (MSER_MPs) and MSER_MPs with mean pixel values within region (MSER_MPsM) features. 
The proposed meta-XGBoost algorithm is capable of obtaining better results than XGBoost with the CART, DART, linear and RaF boosters, and it could be an alternative to the other considered classifiers in terms of the classification of hyperspectral images using advanced spectral-spatial features, especially from generalized classification accuracy and model training efficiency perspectives.
|
16
|
Object-Based Ensemble Learning for Pan-European Riverscape Units Mapping Based on Copernicus VHR and EU-DEM Data Fusion. Remote Sensing 2020. [DOI: 10.3390/rs12071222] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/17/2022]
Abstract
Recent developments in the fields of geographical object-based image analysis (GEOBIA) and ensemble learning (EL) have led the way to the development of automated processing frameworks suitable to tackle large-scale problems. Mapping riverscape units has been recognized in fluvial remote sensing as an important concern for understanding the macrodynamics of a river system and, if applied at large scales, it can be a powerful tool for monitoring purposes. In this study, the potential of GEOBIA and EL algorithms was tested for the mapping of key riverscape units along the main European river network. The Copernicus VHR Image Mosaic and the EU Digital Elevation Model (EU-DEM), both made available through the Copernicus Land Monitoring Service, were integrated within a hierarchical object-based architecture. In a first step, the most well-known EL techniques (bagging, boosting and voting) were tested for the automatic classification of water, sediment bars, riparian vegetation and other floodplain units. Random forest was found to be the best classifier and was therefore used in a second phase to classify the entire object-based river network. Finally, an independent validation was performed that took the polygon area into consideration within the accuracy assessment, hence improving the efficiency of the classification accuracy of the GEOBIA-derived map, both globally and by geographical zone. As a result, we automatically processed almost 2 million square kilometers at a spatial resolution of 2.5 meters, producing a riverscape-units map with a global overall accuracy of 0.915 and with per-class F1 accuracies in the range 0.79–0.97. The obtained results may allow for future studies aimed at quantitative, objective and continuous monitoring of river evolutions and fluvial geomorphological processes at the scale of Europe.
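The overall accuracy and per-class F1 scores quoted in this abstract are standard quantities derived from a confusion matrix, as the sketch below illustrates. It counts samples equally and does not reproduce the polygon-area weighting the authors mention; the function names and the matrix values are hypothetical.

```python
def overall_accuracy(cm):
    """Share of correctly classified samples in a square confusion matrix.

    cm[i][j] counts samples of reference class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_f1(cm):
    """F1 score of each class: 2*TP / (2*TP + FP + FN)."""
    k = len(cm)
    scores = []
    for i in range(k):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                       # reference i, predicted other
        fp = sum(cm[r][i] for r in range(k)) - tp  # predicted i, reference other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Hypothetical two-class confusion matrix (rows: reference, columns: predicted).
cm = [
    [8, 2],
    [1, 9],
]
oa = overall_accuracy(cm)
f1 = per_class_f1(cm)
print(oa, f1)
```

An area-weighted variant would replace each count in `cm` with the summed polygon area falling in that cell, leaving the two formulas unchanged.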
17
TAI-SARNET: Deep Transferred Atrous-Inception CNN for Small Samples SAR ATR. SENSORS 2020; 20:s20061724. [PMID: 32204506 PMCID: PMC7146637 DOI: 10.3390/s20061724] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/10/2020] [Revised: 03/12/2020] [Accepted: 03/14/2020] [Indexed: 11/16/2022]
Abstract
Since Synthetic Aperture Radar (SAR) targets are full of coherent speckle noise, traditional deep learning models struggle to extract the key features of the targets effectively and also suffer from high computational complexity. To solve this problem, an effective lightweight Convolutional Neural Network (CNN) model incorporating transfer learning is proposed for better handling of SAR target recognition tasks. In this work, we first propose the Atrous-Inception module, which combines atrous convolution with the inception module to obtain rich global receptive fields while strictly controlling the number of parameters and realizing a lightweight network architecture. Secondly, a transfer learning strategy is used to effectively transfer prior knowledge from the optical, non-optical, and hybrid optical and non-optical domains to the SAR target recognition tasks, thereby improving the model’s recognition performance on small-sample SAR target datasets. Finally, the model constructed in this paper is verified to achieve 97.97% on ten types of MSTAR datasets under standard operating conditions, reaching a mainstream target recognition rate. Meanwhile, the method presented in this paper shows strong robustness and generalization performance on a small number of randomly sampled SAR target datasets.
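The atrous (dilated) convolution mentioned above widens the receptive field without adding weights: a kernel of length k with dilation d spans (k - 1) * d + 1 input samples. A minimal one-dimensional sketch follows, written as a "valid" cross-correlation as is usual in CNN layers. It illustrates the general operation only, not the TAI-SARNET module, and the function name is hypothetical.

```python
def atrous_conv1d(signal, kernel, dilation=1):
    """Valid 1-D atrous (dilated) cross-correlation.

    The kernel taps are spaced `dilation` samples apart, so the kernel
    covers (len(kernel) - 1) * dilation + 1 input samples.
    """
    span = (len(kernel) - 1) * dilation + 1
    return [
        sum(kernel[k] * signal[start + k * dilation] for k in range(len(kernel)))
        for start in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5]
kernel = [1, 1, 1]
print(atrous_conv1d(signal, kernel, dilation=1))  # taps on adjacent samples
print(atrous_conv1d(signal, kernel, dilation=2))  # same 3 weights, span of 5
```

The same three weights cover three samples at dilation 1 and five samples at dilation 2, which is how the parameter count stays fixed while the receptive field grows.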
18
Sun Z, Yu A, Dong Z, Luo H. ScanSAR Interferometry of the Gaofen-3 Satellite with Unsynchronized Repeat-Pass Images. SENSORS (BASEL, SWITZERLAND) 2019; 19:s19214689. [PMID: 31661936 PMCID: PMC6864435 DOI: 10.3390/s19214689] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/09/2019] [Revised: 10/07/2019] [Accepted: 10/23/2019] [Indexed: 06/10/2023]
Abstract
Gaofen-3 is a Chinese remote sensing satellite with multiple working modes, among which the scanning synthetic aperture radar (ScanSAR) mode is used for wide-swath imaging. Synthetic aperture radar (SAR) interferometry in the ScanSAR mode provides the most rapid way to obtain a global digital elevation model (DEM), which can also be realized by Gaofen-3. Gaofen-3 ScanSAR interferometry works in the repeat-pass mode, and non-synchronization of the image pair can influence its performance. Non-synchronization can include differences in burst central times, satellite velocities, and burst durations. Therefore, it is necessary to analyze their influences and improve the interferometric coherence. Meanwhile, interferometric phase compensation and rapid DEM geolocation also need to be considered in interferometric processing. In this paper, interferometric coherence was analyzed in detail, followed by an iterative filtering method, which helped to improve the interferometric performance. Further, a phase compensation method for Gaofen-3 was proposed to compensate for the phase error caused by the unsynchronized azimuth time offset of the image pair, and a closed-form solution of DEM geolocation with ground control point (GCP) information was derived. Application of our methods to a pair of Gaofen-3 interferometric images showed that these methods were able to process the images with good accuracy and efficiency. Notably, these analysis and processing methods can also be applied to other SAR satellites in the ScanSAR mode to obtain high-quality DEMs.
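For orientation, interferometric coherence is commonly estimated as the magnitude of the normalized complex cross-correlation of the two co-registered images over a local window. The sketch below implements that standard sample estimator for one window. It is not the paper's analysis or its iterative filtering method, and the function name and sample values are hypothetical.

```python
import math


def coherence_magnitude(first, second):
    """Sample coherence magnitude of two co-registered complex windows.

    first, second: equal-length sequences of complex pixel values taken
    from the same estimation window of each image (an assumption of this
    sketch; real processing slides the window over the whole scene).
    """
    if len(first) != len(second) or not first:
        raise ValueError("windows must be non-empty and of equal length")
    cross = sum(a * b.conjugate() for a, b in zip(first, second))
    power = sum(abs(a) ** 2 for a in first) * sum(abs(b) ** 2 for b in second)
    return abs(cross) / math.sqrt(power) if power > 0 else 0.0


window = [1 + 1j, 2 + 0j]
print(coherence_magnitude(window, window))                  # identical windows
print(coherence_magnitude([1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j]))  # decorrelated
```

The estimate lies between 0 and 1; values near 1 indicate a stable interferometric phase within the window.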
Affiliation(s)
- Zaoyu Sun: College of Electronic Science and Technology, National University of Defense Technology, No. 109 Deya Road, Changsha 410073, China.
- Anxi Yu: College of Electronic Science and Technology, National University of Defense Technology, No. 109 Deya Road, Changsha 410073, China.
- Zhen Dong: College of Electronic Science and Technology, National University of Defense Technology, No. 109 Deya Road, Changsha 410073, China.
- Hui Luo: College of Electronic Science and Technology, National University of Defense Technology, No. 109 Deya Road, Changsha 410073, China.
19
High-Resolution Vegetation Mapping Using eXtreme Gradient Boosting Based on Extensive Features. REMOTE SENSING 2019. [DOI: 10.3390/rs11121505] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/17/2022]
Abstract
Accurate mapping of vegetation is a prerequisite for conserving, managing, and sustainably using vegetation resources, especially under conditions of intensive human activity and accelerating global change. However, it is still challenging to produce high-resolution multiclass vegetation maps with high accuracy, owing to the inability of traditional mapping techniques to distinguish mosaic vegetation classes with subtle differences and to the paucity of fieldwork data. This study created a workflow that adopts a promising classifier, extreme gradient boosting (XGBoost), to produce accurate vegetation maps of two strikingly different cases (the Dzungarian Basin in China and New Zealand) based on extensive features and abundant vegetation data. For the Dzungarian Basin, a vegetation map with seven vegetation types, 17 subtypes, and 43 associations was produced with overall accuracies of 0.907, 0.801, and 0.748, respectively. For New Zealand, a map of 10 habitats and a map of 41 vegetation classes were produced with overall accuracies of 0.946 and 0.703, respectively. The workflow, which incorporates simplified field survey procedures, outperformed conventional field survey and remote sensing based methods in terms of accuracy and efficiency. In addition, it opens up the possibility of building large-scale, high-resolution, and timely vegetation monitoring platforms for most terrestrial ecosystems worldwide with the aid of Google Earth Engine and citizen science programs.
20
Early Season Mapping of Sugarcane by Applying Machine Learning Algorithms to Sentinel-1A/2 Time Series Data: A Case Study in Zhanjiang City, China. REMOTE SENSING 2019. [DOI: 10.3390/rs11070861] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/16/2022]
Abstract
More than 90% of the sugar production in China comes from sugarcane, which is widely grown in South China. Optical image time series have proven to be efficient for sugarcane mapping. There are, however, two limitations associated with previous research: one is that the critical observations during the sugarcane growing season are limited due to frequent cloudy weather in South China; the other is that the classification method requires imagery time series covering the entire growing season, which reduces the time efficiency. The Sentinel-1A (S1A) synthetic aperture radar (SAR) data, featuring relatively high spatial-temporal resolution, provide an ideal data source for all-weather observations. In this study, we attempted to develop a method for the early season mapping of sugarcane. First, we proposed a framework consisting of two procedures: initial sugarcane mapping using the S1A SAR imagery time series, followed by non-vegetation removal using Sentinel-2 optical imagery. Second, we tested the framework using an incremental classification strategy based on S1A imagery covering the entire 2017–2018 sugarcane season. The study area was in Suixi and Leizhou counties of Zhanjiang city, China. Results indicated that an acceptable accuracy, with a Kappa coefficient above 0.902, can be achieved using time series three months before sugarcane harvest. In general, sugarcane mapping using the combination of VH + VV, as well as VH polarization alone, outperformed mapping using VV alone. Although the XGBoost classifier with VH + VV polarization achieved a maximum accuracy that was slightly lower than that of the random forest (RF) classifier, XGBoost showed promising performance in that it was more robust to overfitting with noisy VV time series and its computation speed was 7.7 times faster than that of the RF classifier. The total sugarcane areas in Suixi and Leizhou for the 2017–2018 harvest year estimated by this study were approximately 598.95 km² and 497.65 km², respectively. The relative accuracy of the total sugarcane mapping area was approximately 86.3%.
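The Kappa coefficient quoted above measures agreement beyond what chance alone would produce. A minimal sketch of Cohen's kappa computed from a confusion matrix follows. The matrix values are hypothetical and do not come from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i that
    were mapped to class j.
    """
    size = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(size)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class matrix (sugarcane vs. other), not from the paper.
matrix = [[45, 5], [5, 45]]
print(cohen_kappa(matrix))
```

Because chance agreement is subtracted, kappa is lower than overall accuracy for the same matrix (0.8 versus 0.9 here).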
21
Polarimetric Target Decompositions and Light Gradient Boosting Machine for Crop Classification: A Comparative Evaluation. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2019. [DOI: 10.3390/ijgi8020097] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/16/2022]
Abstract
By revealing various scattering mechanisms, polarimetric target decompositions provide certain benefits for the interpretation of PolSAR images. This paper tested the capabilities of different polarimetric target decompositions in crop classification, using a recently introduced ensemble learning algorithm, the Light Gradient Boosting Machine (LightGBM). For the classification of different crops (maize, potato, wheat, sunflower, and alfalfa) in the test site, multi-temporal polarimetric C-band RADARSAT-2 images were acquired over an agricultural area near Konya, Turkey. Four different decomposition models (Cloude–Pottier, Freeman–Durden, Van Zyl, and Yamaguchi) were employed to evaluate polarimetric target decomposition for crop classification. Besides the polarimetric target decomposition parameters, the original polarimetric features (linear backscatter coefficients, coherency, and covariance matrices) were also incorporated for crop classification. The experimental results demonstrated that polarimetric target decompositions, with the exception of Cloude–Pottier, were superior to the original features in terms of overall classification accuracy. The highest classification accuracy (92.07%) was achieved by Yamaguchi, whereas the lowest (75.99%) was achieved by the covariance matrix. Model-based decompositions achieved higher performance than eigenvector-based decompositions in terms of class-based accuracies. Furthermore, the results emphasize the added benefits of model-based decompositions for crop classification using PolSAR data.
22
Comparison of Approaches for Urban Functional Zones Classification Based on Multi-Source Geospatial Data: A Case Study in Yuzhong District, Chongqing, China. SUSTAINABILITY 2019. [DOI: 10.3390/su11030660] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/17/2022]
Abstract
Accurate and timely classification and monitoring of urban functional zones are significant in rapidly developing cities, helping to better understand the real and varying urban functions of cities and to support urban planning and management. Many efforts have been undertaken to identify urban functional zones using various classification approaches and multi-source geospatial datasets. The complexity of this category of classification poses tremendous challenges to these studies, especially in terms of classification accuracy; conversely, the rapid development of machine learning technologies provides new opportunities. In this study, a set of commonly used urban functional zone classification approaches, including Multinomial Logistic Regression, K-Nearest Neighbors, Decision Tree, Support Vector Machine (SVM), and Random Forest, are examined and compared with the newly developed eXtreme Gradient Boosting (XGBoost) model, using the case study of Yuzhong District, Chongqing, China. The investigation is based on multi-variate geospatial data, including night-time imagery, geotagged Weibo data, points of interest (POI) from Gaode, and Baidu Heat Map. This study is the first endeavor to implement the XGBoost model in the field of urban functional zone classification. The results suggest that the XGBoost classification model performed the best and was able to achieve an accuracy of 88.05%, which is significantly higher than that of the other commonly used approaches. In addition, the integration of night-time imagery, geotagged Weibo data, POI from Gaode, and Baidu Heat Map has also demonstrated its value for the classification of urban functional zones in this case study.
23
Imaging Time Series for the Classification of EMI Discharge Sources. SENSORS 2018; 18:s18093098. [PMID: 30223496 PMCID: PMC6163566 DOI: 10.3390/s18093098] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/26/2018] [Revised: 09/11/2018] [Accepted: 09/12/2018] [Indexed: 11/24/2022]
Abstract
In this work, we aim to classify a wider range of Electromagnetic Interference (EMI) discharge sources collected from new power plant sites across multiple assets. This engenders a more complex and challenging classification task. The study involves the investigation and development of new and improved feature extraction and data dimension reduction algorithms based on image processing techniques. The approach is to exploit the Gramian Angular Field technique to map the measured EMI time signals to an image, from which the significant information is extracted while redundancy is removed. The image of each discharge type contains a unique fingerprint. Two feature reduction methods, the Local Binary Pattern (LBP) and the Local Phase Quantisation (LPQ), are then applied to the mapped images. This provides feature vectors that can be fed into a Random Forest (RF) classifier. The performance of a previous method and of the two newly proposed methods on the new dataset is compared in terms of classification accuracy, precision, recall, and F-measure. Results show that the new methods perform better than the previous one, with LBP features achieving the best outcome.
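To make the time-series-to-image step concrete, the sketch below builds a Gramian Angular Field in its summation form: the signal is rescaled to [-1, 1], each value is mapped to an angle via arccos, and entry (i, j) of the image is cos(phi_i + phi_j). The abstract does not say whether the summation or difference variant was used, so the summation form is an assumption here, and the function name is hypothetical.

```python
import math


def gramian_angular_summation_field(signal):
    """Map a 1-D signal to a square image with the summation-form GAF."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must not be constant")
    # Min-max rescale to [-1, 1]; clamp to guard against rounding error.
    scaled = [max(-1.0, min(1.0, (2 * v - hi - lo) / (hi - lo))) for v in signal]
    # Polar encoding: each sample becomes an angle in [0, pi].
    angles = [math.acos(v) for v in scaled]
    # Entry (i, j) encodes the pair of samples i and j.
    return [[math.cos(a + b) for b in angles] for a in angles]


image = gramian_angular_summation_field([0.0, 1.0, 2.0])
for row in image:
    print([round(v, 3) for v in row])
```

A signal of n samples yields an n-by-n symmetric image, so texture descriptors such as LBP or LPQ can then be computed on it as on any grayscale image.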
24
A Generalized Zero-Shot Learning Framework for PolSAR Land Cover Classification. REMOTE SENSING 2018. [DOI: 10.3390/rs10081307] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/17/2022]
Abstract
Most supervised classification methods for polarimetric synthetic aperture radar (PolSAR) data rely on abundant labeled samples and cannot tackle the problem of categorizing or inferring unseen land cover classes for which no training samples exist. To categorize instances from both seen and unseen classes simultaneously, a generalized zero-shot learning (GZSL)-based PolSAR land cover classification framework is proposed. Semantic attributes are first collected to describe the characteristics of typical land cover types in PolSAR images, and the semantic relevance between attributes is established to relate unseen and seen classes. Via latent embedding, the projection between mid-level polarimetric features and semantic attributes for each land cover class can be obtained during the training stage. The GZSL model for PolSAR data is constructed from the mid-level polarimetric features, the projection relationship, and the semantic relevance. Finally, the labels of the test instances can be predicted, even for some unseen classes. Experiments on three real RadarSAT-2 PolSAR datasets show that the proposed framework can classify both seen and unseen land cover classes with limited kinds of training classes, which reduces the requirement for labeled samples. The classification accuracy of the unseen land cover class reaches about 73% if semantic relevance exists during the training stage.
25
Speckle Filtering of GF-3 Polarimetric SAR Data with Joint Restriction Principle. SENSORS 2018; 18:s18051533. [PMID: 29757231 PMCID: PMC5981477 DOI: 10.3390/s18051533] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/09/2018] [Revised: 05/08/2018] [Accepted: 05/09/2018] [Indexed: 11/17/2022]
Abstract
Polarimetric SAR (PolSAR) scattering characteristics of imagery are always obtained from second-order moment estimation of multi-polarization data, that is, the estimation of covariance or coherency matrices. Due to the extra paths that signals reflected from separate scatterers within the resolution cell have to travel, speckle noise always exists in SAR images and has a severe impact on the scattering performance, especially in single look complex images. In order to achieve high accuracy in estimating covariance or coherency matrices, three aspects are taken into consideration: (1) the edges and texture of the scene should remain distinct after speckle filtering; (2) the statistical characteristic should be similar to that of the object pixel; and (3) the polarimetric scattering signature should be preserved, in addition to speckle reduction. In this paper, a joint restriction principle is proposed to meet these requirements. Three different restriction principles are introduced into the speckle filtering process. First, a new template, which is more suitable for point or line targets, is designed to ensure morphological consistency. Then, the extent sigma filter is used to restrict the pixels in the aforementioned template to those with an identical statistical characteristic. Finally, a polarimetric similarity factor is applied to the same pixels to guarantee similar polarimetric features among the candidate pixels. This processing procedure is named speckle filtering with joint restriction principle, and the approach is applied to GF-3 polarimetric SAR data acquired over San Francisco, CA, USA. Its effectiveness in keeping image sharpness, preserving the scattering mechanism, and reducing speckle is validated by comparison with boxcar filters and the refined Lee filter.
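The boxcar filter used as a baseline above is a plain moving average. The sketch below applies it to a single real-valued channel; in PolSAR processing the same averaging would be applied to each element of the covariance or coherency matrix. Truncating the window at image borders is an assumption of this sketch, and the function name and sample image are hypothetical.

```python
def boxcar_filter(image, size=3):
    """Moving-average (boxcar) filter over a 2-D list of numbers.

    The window is truncated at the image borders rather than padded,
    which is a choice made for this sketch.
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out_row.append(sum(window) / len(window))
        filtered.append(out_row)
    return filtered


# Hypothetical intensity patch with one bright point target in the centre.
patch = [[1, 1, 1], [1, 10, 1], [1, 1, 1]]
print(boxcar_filter(patch, size=3))
```

The bright centre pixel is smeared into its neighbours, which illustrates why an unrestricted average blurs point and line targets and why the abstract restricts which pixels enter the average.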