1
Guo J, Zhang Z, Guo G, Xiao H, Zhao Q, Zhang C, Lv H, Zhu Z, Wang C. Optimized Random Forest Method for 3D Evaluation of Coalbed Methane Content Using Geophysical Logging Data. ACS OMEGA 2024; 9:35769-35788. [PMID: 39184457 PMCID: PMC11339842 DOI: 10.1021/acsomega.4c04305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 07/28/2024] [Accepted: 07/31/2024] [Indexed: 08/27/2024]
Abstract
Accurate evaluation of coalbed methane (CBM) content is crucial for effective exploration and development. Traditional gas content measurement methods based on laboratory analysis of drill core samples are costly, whereas geophysical logging methods offer a cost-effective alternative by providing continuous high-resolution profiles of rock layer physical properties. However, the relationship between CBM content and geophysical logging data is complex and nonlinear, necessitating an advanced prediction method. This study focuses on the No. 3 coal seam in the Shizhuang South Block of the Qinshui Basin, utilizing geophysical logging data and 148 sets of laboratory core samples. We employed the Random Forest (RF) method optimized with a simulated annealing-genetic algorithm (SA-GA) to develop the SA-GA-RF model for evaluating CBM content. The model's performance was validated using test data and new CBM well data, and it was applied to calculate the vertical gas content profiles of No. 3 coal seam across 128 wells. The SA-GA-RF model demonstrated an average relative error of 13.13% in the test data set, outperforming Backpropagation Neural Network (BPNN), Least Squares Support Vector Machine (LSSVM), Extreme Learning Machine (ELM), and multivariate regression (MR) methods. The model also exhibited strong generalizability in new wells and improved model-building efficiency compared to traditional cross-validation grid search methods. The construction of a three-dimensional CBM content model, incorporating well coordinates and elevation data, allowed for detailed identification of high gas content areas and layers. This three-dimensional model offers a more precise characterization than traditional two-dimensional isopleth maps, providing valuable insights for CBM exploration, reserve evaluation, and production optimization.
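The abstract reports an average relative error without defining it. A minimal sketch of the usual definition follows, assuming it means the mean of |predicted - measured| / |measured| expressed as a percentage; the function name and the sample values are hypothetical and are not taken from the paper.

```python
def mean_relative_error(measured, predicted):
    """Mean of |predicted - measured| / |measured|, returned as a percentage."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for m, p in zip(measured, predicted):
        if m == 0:
            raise ValueError("relative error is undefined for a measured value of zero")
        total += abs(p - m) / abs(m)
    return 100.0 * total / len(measured)


# Hypothetical gas contents (m^3/t); illustrative only.
measured = [10.0, 20.0, 16.0]
predicted = [11.0, 18.0, 16.0]
print(round(mean_relative_error(measured, predicted), 2))
```

Because each error is divided by the measured value, samples with low gas content weigh more heavily in this metric than they would in an absolute error.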
Affiliation(s)
- Jianhong Guo
  - Key Laboratory of Exploration Technologies for Oil and Gas Resources, Ministry of Education, Yangtze University, Wuhan 430100, China
  - College of Geophysics and Petroleum Resources, Yangtze University, Wuhan 430100, China
- Zhansong Zhang
  - Key Laboratory of Exploration Technologies for Oil and Gas Resources, Ministry of Education, Yangtze University, Wuhan 430100, China
  - College of Geophysics and Petroleum Resources, Yangtze University, Wuhan 430100, China
- Hang Xiao
  - Research Institute of Exploration & Development, Sinopec Jianghan Oilfield Company, Wuhan 430223, China
- Qing Zhao
  - Key Laboratory of Exploration Technologies for Oil and Gas Resources, Ministry of Education, Yangtze University, Wuhan 430100, China
  - College of Geophysics and Petroleum Resources, Yangtze University, Wuhan 430100, China
- Chaomo Zhang
  - Key Laboratory of Exploration Technologies for Oil and Gas Resources, Ministry of Education, Yangtze University, Wuhan 430100, China
  - College of Geophysics and Petroleum Resources, Yangtze University, Wuhan 430100, China
- Hengyang Lv
  - Key Laboratory of Exploration Technologies for Oil and Gas Resources, Ministry of Education, Yangtze University, Wuhan 430100, China
  - College of Geophysics and Petroleum Resources, Yangtze University, Wuhan 430100, China
- Zuomin Zhu
  - Key Laboratory of Exploration Technologies for Oil and Gas Resources, Ministry of Education, Yangtze University, Wuhan 430100, China
  - College of Geophysics and Petroleum Resources, Yangtze University, Wuhan 430100, China
- Can Wang
  - Hubei Geol Bur, Hydrogeol & Engn Geol Inst, Jingzhou 434007, China
2
Lasko K, O'Neill FD, Sava E. Automated Mapping of Land Cover Type within International Heterogenous Landscapes Using Sentinel-2 Imagery with Ancillary Geospatial Data. SENSORS (BASEL, SWITZERLAND) 2024; 24:1587. [PMID: 38475125 DOI: 10.3390/s24051587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Revised: 02/01/2024] [Accepted: 02/27/2024] [Indexed: 03/14/2024]
Abstract
A near-global framework for automated training data generation and land cover classification using shallow machine learning with low-density time series imagery does not exist. This study presents a methodology to map nine-class, six-class, and five-class land cover using two dates (winter and non-winter) of a Sentinel-2 granule across seven international sites. The approach uses a series of spectral, textural, and distance decision functions combined with modified ancillary layers (such as global impervious surface and global tree cover) to create binary masks from which to generate a balanced set of training data applied to a random forest classifier. For the land cover masks, stepwise threshold adjustments were applied to reflectance, spectral index values, and Euclidean distance layers, with 62 combinations evaluated. Global (all seven scenes) and regional (arid, tropics, and temperate) adaptive thresholds were computed. An annual 95th and 5th percentile NDVI composite was used to provide temporal corrections to the decision functions, and these corrections were compared against the original model. The accuracy assessment found that the regional adaptive thresholds for both the two-date land cover and the temporally corrected land cover could accurately map land cover type within nine-class (68.4% vs. 73.1%), six-class (79.8% vs. 82.8%), and five-class (80.1% vs. 85.1%) schemes. Lastly, the five-class and six-class models were compared with a manually labeled deep learning model (Esri), where they performed with similar accuracies (five classes: Esri 80.0 ± 3.4%, region corrected 85.1 ± 2.9%). The results highlight not only performance in line with an intensive deep learning approach, but also that reasonably accurate models can be created without a full annual time series of imagery.
Affiliation(s)
- Kristofer Lasko
  - Geospatial Research Laboratory, Engineer Research and Development Center, 7701 Telegraph Road, Bldg 2592, Alexandria, VA 22315, USA
- Francis D O'Neill
  - Geospatial Research Laboratory, Engineer Research and Development Center, 7701 Telegraph Road, Bldg 2592, Alexandria, VA 22315, USA
- Elena Sava
  - Geospatial Research Laboratory, Engineer Research and Development Center, 7701 Telegraph Road, Bldg 2592, Alexandria, VA 22315, USA
3
Lloyd M, Ganji A, Xu J, Venuta A, Simon L, Zhang M, Saeedi M, Yamanouchi S, Apte J, Hong K, Hatzopoulou M, Weichenthal S. Predicting spatial variations in annual average outdoor ultrafine particle concentrations in Montreal and Toronto, Canada: Integrating land use regression and deep learning models. ENVIRONMENT INTERNATIONAL 2023; 178:108106. [PMID: 37544265 DOI: 10.1016/j.envint.2023.108106] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/08/2023] [Revised: 06/28/2023] [Accepted: 07/19/2023] [Indexed: 08/08/2023]
Abstract
BACKGROUND Concentrations of outdoor ultrafine particles (UFP; <0.1 µm) and black carbon (BC) can vary greatly within cities, and long-term exposures to these pollutants have been associated with a variety of adverse health outcomes.
OBJECTIVE This study integrated multiple approaches to develop new models to estimate within-city spatial variations in annual median (i.e., average) outdoor UFP and BC concentrations as well as mean UFP size in Canada's two largest cities, Montreal and Toronto.
METHODS We conducted year-long mobile monitoring campaigns in each city that included evenings and weekends. We developed generalized additive models trained on land use parameters and deep Convolutional Neural Network (CNN) models trained on satellite-view images. Using predictions from these models, we developed final combined models.
RESULTS In Toronto, the median observed UFP concentration, UFP size, and BC concentration values were 16,172 pt/cm³, 33.7 nm, and 1225 ng/m³, respectively. In Montreal, the median observed UFP concentration, UFP size, and BC concentration values were 14,702 pt/cm³, 29.7 nm, and 1060 ng/m³, respectively. For all pollutants in both cities, the proportion of spatial variation explained (i.e., R²) was slightly greater (1-2 percentage points) for the combined models than for the generalized additive models and greater still (approximately 10 percentage points) than for the deep CNN models. The Toronto combined model R² values in the test set were 0.73, 0.55, and 0.61 for UFP concentration, UFP size, and BC concentration, respectively. The Montreal combined model R² values were 0.60, 0.49, and 0.60 for UFP concentration, UFP size, and BC concentration, respectively. For each pollutant, predictions from the combined, deep CNN, and generalized additive models were highly correlated with each other, and differences between models were explored in sensitivity analyses.
CONCLUSION Predictions from these models are available to support future epidemiological research examining long-term health impacts of outdoor UFPs and BC.
Affiliation(s)
- Marshall Lloyd
  - Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Québec H3A 1G1, Canada.
- Arman Ganji
  - Department of Civil and Mineral Engineering, University of Toronto, Toronto, Ontario M5S 1A4, Canada.
- Junshi Xu
  - Department of Civil and Mineral Engineering, University of Toronto, Toronto, Ontario M5S 1A4, Canada.
- Alessya Venuta
  - Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Québec H3A 1G1, Canada.
- Leora Simon
  - Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Québec H3A 1G1, Canada.
- Mingqian Zhang
  - Department of Civil and Mineral Engineering, University of Toronto, Toronto, Ontario M5S 1A4, Canada.
- Milad Saeedi
  - Department of Civil and Mineral Engineering, University of Toronto, Toronto, Ontario M5S 1A4, Canada.
- Shoma Yamanouchi
  - Department of Civil and Mineral Engineering, University of Toronto, Toronto, Ontario M5S 1A4, Canada.
- Joshua Apte
  - Department of Civil and Environmental Engineering, University of California at Berkeley, Berkeley, CA 94720, United States; School of Public Health, University of California, Berkeley, CA 94720, United States.
- Kris Hong
  - Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Québec H3A 1G1, Canada.
- Marianne Hatzopoulou
  - Department of Civil and Mineral Engineering, University of Toronto, Toronto, Ontario M5S 1A4, Canada.
- Scott Weichenthal
  - Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Québec H3A 1G1, Canada.
4
Ding C, Pereira T, Xiao R, Lee RJ, Hu X. Impact of Label Noise on the Learning Based Models for a Binary Classification of Physiological Signal. SENSORS (BASEL, SWITZERLAND) 2022; 22:s22197166. [PMID: 36236265 PMCID: PMC9572105 DOI: 10.3390/s22197166] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/19/2022] [Revised: 09/12/2022] [Accepted: 09/15/2022] [Indexed: 06/13/2023]
Abstract
Label noise is omnipresent in the annotation process and has an impact on supervised learning algorithms. This work focuses on the impact of label noise on the performance of learning models by examining the effect of random and class-dependent label noise on a binary classification task: quality assessment for photoplethysmography (PPG). The PPG signal is used to detect physiological changes, and its quality can have a significant impact on subsequent tasks, which makes PPG quality assessment a particularly good target for examining the impact of label noise in the field of biomedicine. Random and class-dependent label noise was introduced separately into the training set to emulate the errors associated with fatigue and bias in labeling data samples. We also tested different representations of the PPG, including features defined by domain experts, the 1D raw signal, and a 2D image. Four different classifiers are tested on the noisy training data, namely support vector machine (SVM), XGBoost, 1D Resnet and 2D Resnet, which handle the three representations, respectively. The results showed that the two deep learning models were more robust than the two traditional machine learning models for both the random and class-dependent label noise. From the representation perspective, the 2D image shows better robustness compared to the 1D raw signal. The logits from three classifiers are also analyzed; the predicted probabilities tend to be more dispersed when more label noise is introduced. In this work, we investigated various factors related to label noise, including representations, label noise type, and data imbalance, which can serve as a good guide for designing methods that are more robust to label noise in future work.
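The two noise types named in this abstract can be illustrated with a short sketch. It assumes binary labels coded as 0 and 1 and independent flips per sample; the function names, the `rates` mapping, and the flip probabilities are hypothetical and do not reproduce the authors' protocol.

```python
import random


def add_random_noise(labels, rate, seed=0):
    """Flip each binary label with the same probability, whatever its class."""
    rng = random.Random(seed)
    return [1 - y if rng.random() < rate else y for y in labels]


def add_class_dependent_noise(labels, rates, seed=0):
    """Flip each binary label with a probability that depends on its class.

    `rates` maps a class (0 or 1) to its flip probability; classes that are
    missing from the mapping are left untouched.
    """
    rng = random.Random(seed)
    return [1 - y if rng.random() < rates.get(y, 0.0) else y for y in labels]


# Hypothetical training labels: 1 = good-quality segment, 0 = bad-quality segment.
labels = [0, 1] * 50
uniform = add_random_noise(labels, rate=0.2)
biased = add_class_dependent_noise(labels, rates={1: 0.4})  # only class 1 is corrupted
print(sum(a != b for a, b in zip(labels, uniform)),
      sum(a != b for a, b in zip(labels, biased)))
```

Random noise leaves the class balance roughly unchanged on average, whereas class-dependent noise shifts it toward the class that absorbs the flipped labels.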
Affiliation(s)
- Cheng Ding
  - Department of Biomedical Engineering, Georgia Institute of Technology, Emory University, Atlanta, GA 30332, USA
- Tania Pereira
  - INESC TEC-Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal
- Ran Xiao
  - Nell Hodgson Woodruff School of Nursing, Emory University, Atlanta, GA 30332, USA
- Randall J. Lee
  - School of Medicine, University of California San Francisco, San Francisco, CA 94143, USA
- Xiao Hu
  - Nell Hodgson Woodruff School of Nursing, Emory University, Atlanta, GA 30332, USA
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA 30332, USA
  - Department of Computer Science, College of Arts and Sciences, Emory University, Atlanta, GA 30332, USA
5
Zhang Y, Zong R, Shang L, Kou Z, Wang D. An active one-shot learning approach to recognizing land usage from class-wise sparse satellite imagery in smart urban sensing. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
6
Remote Sensing Mapping of Build-Up Land with Noisy Label via Fault-Tolerant Learning. REMOTE SENSING 2022. [DOI: 10.3390/rs14092263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
China’s urbanization has dramatically accelerated in recent decades. Land for urban build-up has changed not only in large cities but also in small counties. Land cover mapping is one of the fundamental tasks in the field of remote sensing and has received great attention. However, most current mapping requires significant manual effort for labeling or classification. It is of great practical value to use existing low-resolution label data for the classification of higher-resolution images. In this regard, this work proposes a method based on noisy-label learning for fine-grained mapping of urban build-up land in a county in central China. Specifically, this work produces a build-up land map with a resolution of 10 m based on a land cover map with a resolution of 30 m. Experimental results show that accuracy is improved by 5.5% compared with the baseline method. This result indicates that the time required to produce a fine land cover map can be significantly reduced by using existing coarse-grained data.
7
Jiang J, Ma J, Liu X. Multilayer Spectral-Spatial Graphs for Label Noisy Robust Hyperspectral Image Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:839-852. [PMID: 33090961 DOI: 10.1109/tnnls.2020.3029523] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/11/2023]
Abstract
In hyperspectral image (HSI) analysis, label information is a scarce resource and it is unavoidably affected by human and nonhuman factors, resulting in a large amount of label noise. Although most of the recent supervised HSI classification methods have achieved good classification results, their performance drastically decreases when the training samples contain label noise. To address this issue, we propose a label noise cleansing method based on spectral-spatial graphs (SSGs). In particular, an affinity graph is constructed based on spectral and spatial similarity, in which pixels in a superpixel segmentation-based homogeneous region are connected, and their similarities are measured by spectral feature vectors. Then, we use the constructed affinity graph to regularize the process of label noise cleansing. In this manner, we transform label noise cleansing to an optimization problem with a graph constraint. To fully utilize spatial information, we further develop multiscale segmentation-based multilayer SSGs (MSSGs). It can efficiently merge the complementary information of multilayer graphs and thus provides richer spatial information compared with any single-layer graph obtained from isolation segmentation. Experimental results show that MSSG reduces the level of label noise. Compared with the state of the art, the proposed MSSG method exhibits significantly enhanced classification accuracy toward the training data with noisy labels. The significant advantages of the proposed method over four major classifiers are also demonstrated. The source code is available at https://github.com/junjun-jiang/MSSG.
8
Makhamreh Z, Hdoush AAA, Ziadat F, Kakish S. Detection of seasonal land use pattern and irrigated crops in drylands using multi-temporal sentinel images. ENVIRONMENTAL EARTH SCIENCES 2022; 81:120. [DOI: 10.1007/s12665-022-10249-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2020] [Accepted: 01/25/2022] [Indexed: 09/02/2023]
9
Comparison of the Novel Probabilistic Self-Optimizing Vectorized Earth Observation Retrieval Classifier with Common Machine Learning Algorithms. REMOTE SENSING 2022. [DOI: 10.3390/rs14020378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The Vectorized Earth Observation Retrieval (VEOR) algorithm is a novel algorithm suited to the efficient supervised classification of large Earth Observation (EO) datasets. VEOR addresses shortcomings in well-established machine learning methods with an emphasis on numerical performance. Its characteristics include (1) derivation of classification probability; (2) objective selection of classification features that maximize Cohen’s kappa coefficient (κ) derived from iterative “leave-one-out” cross-validation; (3) reduced sensitivity of the classification results to imbalanced classes; (4) smoothing of the classification probability field to reduce noise/mislabeling; (5) numerically efficient retrieval based on a pre-computed look-up vector (LUV); and (6) separate parametrization of the algorithm for each discrete feature class (e.g., land cover). Within this study, the performance of the VEOR classifier was compared to other commonly used machine learning algorithms: K-nearest neighbors, support vector machines, Gaussian process, decision trees, random forest, artificial neural networks, AdaBoost, Naive Bayes and Quadratic Discriminant Analysis. Firstly, the comparison was performed using synthetic 2D (two-dimensional) datasets featuring different sample sizes, levels of noise (i.e., mislabeling) and class imbalance. Secondly, the same experiments were repeated for 7D datasets consisting of informative, redundant and insignificant features. Finally, the benchmarking of the classifiers involved cloud discrimination using MODIS satellite spectral measurements and a reference cloud mask derived from combined CALIOP lidar and CPR radar data. The results revealed that the proposed VEOR algorithm accurately discriminated cloud cover using MODIS data and accurately classified large synthetic datasets with low or moderate levels of noise and class imbalance. By contrast, VEOR did not perform well on significantly distorted or small datasets. Nevertheless, the comparisons showed that VEOR was among the 3–4 most accurate classifiers and that it can be applied to large Earth Observation datasets.
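Cohen's kappa, which the abstract names as the feature-selection criterion, compares observed agreement with the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below computes it from a confusion matrix; the function name and the example matrix are illustrative assumptions, and the leave-one-out loop described in the abstract is not reproduced.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal shares, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class result (e.g., cloud vs. clear), not data from the study.
print(round(cohens_kappa([[20, 5], [10, 15]]), 3))
```

In this example the observed agreement is 0.7 and the chance agreement is 0.5, so kappa credits the classifier only with the share of agreement that exceeds chance.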
10
A Comparison of Three Airborne Laser Scanner Types for Species Identification of Individual Trees. SENSORS 2021; 22:s22010035. [PMID: 35009577 PMCID: PMC8747214 DOI: 10.3390/s22010035] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/27/2021] [Revised: 12/07/2021] [Accepted: 12/20/2021] [Indexed: 11/16/2022]
Abstract
Species identification is a critical factor for obtaining accurate forest inventories. This paper compares the same method of tree species identification (at the individual crown level) across three different types of airborne laser scanning systems (ALS): two linear lidar systems (monospectral and multispectral) and one single-photon lidar (SPL) system to ascertain whether current individual tree crown (ITC) species classification methods are applicable across all sensors. SPL is a new type of sensor that promises comparable point densities from higher flight altitudes, thereby increasing lidar coverage. Initial results indicate that the methods are indeed applicable across all of the three sensor types with broadly similar overall accuracies (Hardwood/Softwood, 83-90%; 12 species, 46-54%; 4 species, 68-79%), with SPL being slightly lower in all cases. The additional intensity features that are provided by multispectral ALS appear to be more beneficial to overall accuracy than the higher point density of SPL. We also demonstrate the potential contribution of lidar time-series data in improving classification accuracy (Hardwood/Softwood, 91%; 12 species, 58%; 4 species, 84%). Possible causes for lower SPL accuracy are (a) differences in the nature of the intensity features and (b) differences in first and second return distributions between the two linear systems and SPL. We also show that segmentation (and field-identified training crowns deriving from segmentation) that is performed on an initial dataset can be used on subsequent datasets with similar overall accuracy. To our knowledge, this is the first study to compare these three types of ALS systems for species identification at the individual tree level.
11
Giry Fouquet E, Fauvel M, Mallet C. Fast estimation for robust supervised classification with mixture models. Pattern Recognit Lett 2021. [DOI: 10.1016/j.patrec.2021.10.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
12
Satellite Image Classification Using a Hierarchical Ensemble Learning and Correlation Coefficient-Based Gravitational Search Algorithm. REMOTE SENSING 2021. [DOI: 10.3390/rs13214351] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
Abstract
Satellite image classification is widely used in various real-time applications, such as the military, geospatial surveys, surveillance and environmental monitoring. Therefore, the effective classification of satellite images is required to improve classification accuracy. In this paper, the combination of Hierarchical Framework and Ensemble Learning (HFEL) and optimal feature selection is proposed for the precise identification of satellite images. The HFEL uses three different types of Convolutional Neural Networks (CNN), namely AlexNet, LeNet-5 and a residual network (ResNet), to extract the appropriate features from images of the hierarchical framework. Additionally, the optimal features from the feature set are extracted using the Correlation Coefficient-Based Gravitational Search Algorithm (CCGSA). Further, the Multi Support Vector Machine (MSVM) is used to classify the satellite images by extracted features from the fully connected layers of the CNN and selected features of the CCGSA. Hence, the combination of HFEL and CCGSA is used to obtain the precise classification over different datasets such as the SAT-4, SAT-6 and Eurosat datasets. The performance of the proposed HFEL–CCGSA is analyzed in terms of accuracy, precision and recall. The experimental results show that the HFEL–CCGSA method provides effective classification over the satellite images. The classification accuracy of the HFEL–CCGSA method is 99.99%, which is high when compared to AlexNet, LeNet-5 and ResNet.
13
Improving Imbalanced Land Cover Classification with K-Means SMOTE: Detecting and Oversampling Distinctive Minority Spectral Signatures. INFORMATION 2021. [DOI: 10.3390/info12070266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Land cover maps are a critical tool to support informed policy development, planning, and resource management decisions. With significant upsides, the automatic production of Land Use/Land Cover (LULC) maps has been a topic of interest for the remote sensing community for several years, but it is still fraught with technical challenges. One such challenge is the imbalanced nature of most remotely sensed data. The asymmetric class distribution negatively impacts the performance of classifiers and adds a new source of error to the production of these maps. In this paper, we address the imbalanced learning problem by using K-means and the Synthetic Minority Oversampling Technique (SMOTE) as an improved oversampling algorithm. K-means SMOTE improves the quality of newly created artificial data by addressing not only the between-class imbalance, as traditional oversamplers do, but also the within-class imbalance, avoiding the generation of noisy data while effectively overcoming data imbalance. The performance of K-means SMOTE is compared to three popular oversampling methods (Random Oversampling, SMOTE and Borderline-SMOTE) using seven remote sensing benchmark datasets, three classifiers (Logistic Regression, K-Nearest Neighbors and Random Forest Classifier) and three evaluation metrics, using a five-fold cross-validation approach with three different initialization seeds. The statistical analysis of the results shows that the proposed method consistently outperforms the remaining oversamplers, producing higher-quality land cover classifications. These results suggest that LULC data can benefit significantly from the use of more sophisticated oversamplers, as spectral signatures for the same class can vary according to geographical distribution.
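For readers unfamiliar with SMOTE, the interpolation step at its core can be sketched as follows. This is plain SMOTE under simplifying assumptions (Euclidean distance, a hypothetical `smote_sketch` helper, made-up two-band samples); the clustering and cluster-selection stages that distinguish K-means SMOTE are omitted.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new sample at a random position on the segment base -> neighbour.
        u = rng.random()
        synthetic.append([a + u * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-band reflectances for an under-represented land cover class.
minority = [[0.10, 0.30], [0.12, 0.28], [0.15, 0.35], [0.11, 0.33]]
for sample in smote_sketch(minority, n_new=3):
    print([round(v, 3) for v in sample])
```

Every synthetic sample lies on a line segment between two existing minority samples, which is why restricting the procedure to well-chosen clusters, as the abstract describes, can limit the generation of noisy samples.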
14
Sentinel-1 and 2 Time-Series for Vegetation Mapping Using Random Forest Classification: A Case Study of Northern Croatia. REMOTE SENSING 2021. [DOI: 10.3390/rs13122321] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 01/26/2023]
Abstract
Land-cover (LC) mapping in a morphologically heterogeneous landscape is a challenging task since various LC classes (e.g., crop types in agricultural areas) are spectrally similar. Most research still relies mainly on optical satellite imagery for these tasks, whereas synthetic aperture radar (SAR) imagery is often neglected. Therefore, this research assessed the classification accuracy using recent Sentinel-1 (S1) SAR and Sentinel-2 (S2) time-series data for LC mapping, especially of vegetation classes. Additionally, ancillary data, such as texture features and spectral indices from S1 and S2, respectively, as well as a digital elevation model (DEM), were used in different classification scenarios. Random Forest (RF) was used for classification tasks using a proposed hybrid reference dataset derived from the European Land Use and Coverage Area Frame Survey (LUCAS), CORINE, and Land Parcel Identification Systems (LPIS) LC databases. Based on the RF variable selection using Mean Decrease Accuracy (MDA), the combination of S1 and S2 data yielded the highest overall accuracy (OA) of 91.78%, with a total disagreement of 8.22%. The most pertinent features for vegetation mapping were GLCM Mean and Variance for S1, and NDVI along with the Red and SWIR bands for S2, while the digital elevation model substantially improved classification as an input feature. The results of this study demonstrated that the aforementioned approach (i.e., RF using a hybrid reference dataset) is well-suited for vegetation mapping using Sentinel imagery and can be applied to large-scale LC classifications.
15
Continental-Scale Land Cover Mapping at 10 m Resolution Over Europe (ELC10). REMOTE SENSING 2021. [DOI: 10.3390/rs13122301] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 11/16/2022]
Abstract
Land cover maps are important tools for quantifying the human footprint on the environment and facilitate reporting and accounting to international agreements addressing the Sustainable Development Goals. Widely used European land cover maps such as CORINE (Coordination of Information on the Environment) are produced at medium spatial resolutions (100 m) and rely on diverse data with complex workflows requiring significant institutional capacity. We present a 10 m resolution land cover map (ELC10) of Europe based on a satellite-driven machine learning workflow that is annually updatable. A random forest classification model was trained on 70K ground-truth points from the LUCAS (Land Use/Cover Area Frame Survey) dataset. Within the Google Earth Engine cloud computing environment, the ELC10 map can be generated from approx. 700 TB of Sentinel imagery within approx. 4 days from a single research user account. The map achieved an overall accuracy of 90% across eight land cover classes and could account for statistical unit land cover proportions within 3.9% (R² = 0.83) of the actual value. These accuracies are higher than those of CORINE (100 m) and other 10 m land cover maps including S2GLC and FROM-GLC10. Spectro-temporal metrics that capture the phenology of land cover classes were most important in producing high mapping accuracies. We found that the atmospheric correction of Sentinel-2 and the speckle filtering of Sentinel-1 imagery had a minimal effect on enhancing the classification accuracy (<1%). However, combining optical and radar imagery increased accuracy by 3% compared to Sentinel-2 alone and by 10% compared to Sentinel-1 alone. The addition of auxiliary data (terrain, climate and night-time lights) increased accuracy by an additional 2%. By using the centroid pixels from the LUCAS Copernicus module polygons we increased accuracy by <1%, revealing that random forests are robust against contaminated training data. Furthermore, the model requires very little training data to achieve moderate accuracies: the difference between 5K and 50K LUCAS points is only 3% (86% vs. 89%). This implies that significantly fewer resources are necessary for making in situ survey data (such as LUCAS) suitable for satellite-based land cover classification. At 10 m resolution, the ELC10 map can distinguish detailed landscape features like hedgerows and gardens, and therefore holds potential for aerial statistics at the city borough level and monitoring property-level environmental interventions (e.g., tree planting). Due to the reliance on purely satellite-based input data, the ELC10 map can be continuously updated independent of any country-specific geographic datasets.
|
16
|
Li Y, Zhang Y, Zhu Z. Error-Tolerant Deep Learning for Remote Sensing Image Scene Classification. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:1756-1768. [PMID: 32413949 DOI: 10.1109/tcyb.2020.2989241] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 06/11/2023]
Abstract
Owing to its many potential applications, remote sensing image scene classification (RSSC) has attracted broad interest. While the deep convolutional neural network (CNN) has recently achieved tremendous success in RSSC, its superior performance depends heavily on a large number of accurately labeled samples, which require considerable time and manpower to generate for a large-scale remote sensing image scene dataset. In contrast, it is not only relatively easy to collect coarse and noisy labels but also inevitable to introduce label noise when collecting large-scale annotated data in the remote sensing scenario. Therefore, it is of great practical importance to robustly learn a superior CNN-based classification model from a remote sensing image scene dataset containing non-negligible or even significant error labels. To this end, this article proposes a new RSSC-oriented error-tolerant deep learning (RSSC-ETDL) approach to mitigate the adverse effect of incorrect labels in the remote sensing image scene dataset. In our proposed RSSC-ETDL method, learning multiview CNNs and correcting error labels are conducted alternately in an iterative manner. To make this alternating scheme work effectively, we propose a novel adaptive multifeature collaborative representation classifier (AMF-CRC) that benefits from adaptively combining multiple features of CNNs to correct the labels of uncertain samples. To quantitatively evaluate the performance of error-tolerant methods in the remote sensing domain, we construct remote sensing image scene datasets with: 1) simulated noisy labels by corrupting the open datasets with varying error rates and 2) real noisy labels by deploying the greedy annotation strategies that are practically used to accelerate the process of annotating remote sensing image scene datasets. Extensive experiments on these datasets demonstrate that our proposed RSSC-ETDL approach outperforms the state-of-the-art approaches.
|
17
|
Identifying Spatiotemporal Patterns in Land Use and Cover Samples from Satellite Image Time Series. REMOTE SENSING 2021. [DOI: 10.3390/rs13050974] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/16/2022]
Abstract
The use of satellite image time series analysis and machine learning methods brings new opportunities and challenges for land use and cover changes (LUCC) mapping over large areas. One of these challenges is the need for samples that properly represent the high variability of land use and cover classes over large areas to train supervised machine learning methods and to produce accurate LUCC maps. This paper addresses this challenge and presents a method to identify spatiotemporal patterns in land use and cover samples to infer subclasses through the phenological and spectral information provided by satellite image time series. The proposed method uses self-organizing maps (SOMs) to reduce the data dimensionality, creating primary clusters. From these primary clusters, it uses hierarchical clustering to create subclusters that recognize intra-class variability intrinsic to different regions and periods, mainly in large areas and multiple years. To show how the method works, we use MODIS image time series associated with samples of cropland and pasture classes over the Cerrado biome in Brazil. The results show that the proposed method is suitable for identifying spatiotemporal patterns in land use and cover samples that can be used to infer subclasses, mainly for crop types.
|
18
|
Locating and Dating Land Cover Change Events in the Renosterveld, a Critically Endangered Shrubland Ecosystem. REMOTE SENSING 2021. [DOI: 10.3390/rs13050834] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Land cover change is the leading cause of global biodiversity decline. New satellite platforms allow for monitoring of habitats in increasingly fine detail, but most applications have been limited to forested ecosystems. I demonstrate the potential for detailed mapping and accurate dating of land cover change events in a highly biodiverse, Critically Endangered shrubland ecosystem, the Renosterveld of South Africa. Using supervised classification of Sentinel-2 data, and subsequent manual verification with very high resolution imagery, I locate all conversion of Renosterveld to non-natural land cover between 2016 and 2020. Land cover change events are further assigned dates using high temporal frequency data from Planet Labs. A total area of 478.6 hectares of Renosterveld loss was observed over this period, accounting for 0.72% of the remaining natural vegetation in the region. In total, 50% of change events were dated to within two weeks of their actual occurrence, and 87% to within two months. The Renosterveld loss identified here is almost entirely attributable to conversion of natural vegetation to cropland through ploughing. Change often preceded the planting and harvesting seasons of rainfed annual grains. These results show the potential for new satellite platforms to accurately map land cover change in non-forest ecosystems, and detect change within days of its occurrence. There is potential to use this and similar datasets to automate the process of change detection and monitor change continuously.
|
19
|
Accuracy Improvements to Pixel-Based and Object-Based LULC Classification with Auxiliary Datasets from Google Earth Engine. REMOTE SENSING 2021. [DOI: 10.3390/rs13030453] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 11/17/2022]
Abstract
The monitoring and assessment of land use/land cover (LULC) change over large areas are of great importance in numerous research areas, such as natural resource protection, sustainable development, and climate change. However, accurately extracting LULC only using the spectral features of satellite images is difficult owing to landscape heterogeneities over large areas. To improve the accuracy of LULC classification, numerous studies have introduced other auxiliary features to the classification model. The Google Earth Engine (GEE) not only provides powerful computing capabilities, but also provides a large amount of remote sensing data and various auxiliary datasets. However, the different effects of various auxiliary datasets in the GEE on the improvement of the LULC classification accuracy need to be elucidated along with methods that can optimize combinations of auxiliary datasets for pixel- and object-based classification. Herein, we comprehensively analyze the performance of different auxiliary features in improving the accuracy of pixel- and object-based LULC classification models with medium resolution. We select the Yangtze River Delta in China as the study area and Landsat-8 OLI data as the main dataset. Six types of features, including spectral features, remote sensing multi-indices, topographic features, soil features, distance to the water source, and phenological features, are derived from auxiliary open-source datasets in GEE. We then examine the effect of auxiliary datasets on the improvement of the accuracy of seven pixel-based and seven object-based random forest classification models. The results show that regardless of the types of auxiliary features, the overall accuracy of the classification can be improved. The results further show that the object-based classification achieves higher overall accuracy compared to that obtained by the pixel-based classification. The best overall accuracy from the pixel-based (object-based) classification model is 94.20% (96.01%). The topographic features play the most important role in improving the overall accuracy of classification in the pixel- and object-based models comprising all features. Although a higher accuracy is achieved when the object-based method is used with only spectral data, small objects on the ground cannot be monitored. However, combined with many types of auxiliary features, the object-based method can identify small objects while also achieving greater accuracy. Thus, when applying object-based classification models to mid-resolution remote sensing images, different types of auxiliary features are required. Our research results improve the accuracy of LULC classification in the Yangtze River Delta and further provide a benchmark for other regions with large landscape heterogeneity.
|
20
|
Wang J, Bai Y, Xia B. Simultaneous Diagnosis of Severity and Features of Diabetic Retinopathy in Fundus Photography Using Deep Learning. IEEE J Biomed Health Inform 2020; 24:3397-3407. [PMID: 32750975 DOI: 10.1109/jbhi.2020.3012547] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 11/10/2022]
Abstract
Deep learning methods for diabetic retinopathy (DR) diagnosis are often criticized for the lack of interpretability of their diagnostic results, which limits their application in the clinic. Simultaneous prediction of DR related features during the DR severity diagnosis can resolve this issue by providing supporting evidence (i.e. DR related features) for the diagnostic result (i.e. DR severity). In this study, we propose a hierarchical multi-task deep learning framework for simultaneous diagnosis of DR severity and DR related features in fundus images. A hierarchical structure is introduced to incorporate the causal relationship between DR related features and DR severity levels. In the experiments, the proposed approach was evaluated on two independent testing sets using quadratic weighted Cohen's kappa coefficient, receiver operating characteristic analysis, and precision-recall analysis. A grader study was also conducted to compare the performance of the proposed approach with that of general ophthalmologists with different levels of experience. The results demonstrate that the proposed approach could improve the performance for both DR severity diagnosis and DR related feature detection compared with traditional deep learning-based methods. It achieves performance close to general ophthalmologists with five years of experience when diagnosing DR severity levels, and general ophthalmologists with ten years of experience for referable DR detection.
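The quadratic weighted Cohen's kappa named in this abstract can be illustrated with a minimal sketch. It assumes ordinal grades coded as integers `0..n_classes-1`; the function name, the coding, and the choice of five grades in the usage note are illustrative assumptions, not details taken from the paper.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Cohen's kappa with quadratic disagreement weights.

    Assumes both rating lists hold integers in 0..n_classes-1
    (a hypothetical coding chosen for this sketch).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_classes < 2:
        raise ValueError("need at least two classes")
    n = len(rater_a)

    # Observed agreement table: rows = rater A, columns = rater B.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_classes))
              for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Quadratic penalty: grows with the squared grade distance.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # chance agreement count
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters used one identical class throughout.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

For example, `quadratic_weighted_kappa(model_grades, reference_grades, 5)` would score two hypothetical lists of five-level severity grades. Disagreements between distant grades are penalized more than disagreements between adjacent ones, which is why this variant suits ordinal scales.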
|
21
|
Detection of Irrigated and Rainfed Crops in Temperate Areas Using Sentinel-1 and Sentinel-2 Time Series. REMOTE SENSING 2020. [DOI: 10.3390/rs12183044] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 11/16/2022]
Abstract
The detection of irrigated areas by means of remote sensing is essential to improve agricultural water resource management. Currently, data from the Sentinel constellation offer new possibilities for mapping irrigated areas at the plot scale. Until now, few studies have used Sentinel-1 (S1) and Sentinel-2 (S2) data to provide approaches for mapping irrigated plots in temperate areas. This study proposes a method for detecting irrigated and rainfed plots in a temperate area (southwestern France) jointly using optical (Sentinel-2), radar (Sentinel-1) and meteorological (SAFRAN) time series, through a classification algorithm. Monthly cumulative indices calculated from these satellite data were used in a Random Forest classifier. Two data years have been used, with different meteorological characteristics, allowing the performance of the method to be analysed under different climatic conditions. The combined use of the whole cumulative data (radar, optical and weather) improves the irrigated crop classifications (Overall Accuracy (OA) ≈ 0.7) compared to the classifications obtained using each data source separately (OA < 0.5). The use of monthly cumulative rainfall allows a significant improvement of the F-score of irrigated and rainfed classes. Our study also reveals that the use of cumulative monthly indices leads to performances similar to those of the use of 10-day images while considerably reducing computational resources.
|
22
|
Li M, Huang S, De Bock J, de Cooman G, Pižurica A. A Robust Dynamic Classifier Selection Approach for Hyperspectral Images with Imprecise Label Information. SENSORS 2020; 20:s20185262. [PMID: 32942592 PMCID: PMC7570993 DOI: 10.3390/s20185262] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/05/2020] [Revised: 09/07/2020] [Accepted: 09/11/2020] [Indexed: 11/29/2022]
Abstract
Supervised hyperspectral image (HSI) classification relies on accurate label information. However, it is not always possible to collect perfectly accurate labels for training samples. This motivates the development of classifiers that are sufficiently robust to some reasonable amounts of errors in data labels. Despite the growing importance of this aspect, it has not been sufficiently studied in the literature yet. In this paper, we analyze the effect of erroneous sample labels on probability distributions of the principal components of HSIs, and provide in this way a statistical analysis of the resulting uncertainty in classifiers. Building on the theory of imprecise probabilities, we develop a novel robust dynamic classifier selection (R-DCS) model for data classification with erroneous labels. Particularly, spectral and spatial features are extracted from HSIs to construct two individual classifiers for the dynamic selection, respectively. The proposed R-DCS model is based on the robustness of the classifiers’ predictions: the extent to which a classifier can be altered without changing its prediction. We provide three possible selection strategies for the proposed model with different computational complexities and apply them on three benchmark data sets. Experimental results demonstrate that the proposed model outperforms the individual classifiers it selects from and is more robust to errors in labels compared to widely adopted approaches.
Affiliation(s)
- Meizhu Li
  - GAIM, Department of Telecommunications and Information Processing, Ghent University, 9000 Gent, Belgium;
  - Correspondence: (M.L.); (S.H.)
- Shaoguang Huang
  - GAIM, Department of Telecommunications and Information Processing, Ghent University, 9000 Gent, Belgium;
  - Correspondence: (M.L.); (S.H.)
- Jasper De Bock
  - FLip, Department of Electronics and Information Systems, Ghent University, 9052 Gent, Belgium; (J.D.B.); (G.d.C.)
- Gert de Cooman
  - FLip, Department of Electronics and Information Systems, Ghent University, 9052 Gent, Belgium; (J.D.B.); (G.d.C.)
- Aleksandra Pižurica
  - GAIM, Department of Telecommunications and Information Processing, Ghent University, 9000 Gent, Belgium;
|
23
|
A Comparison of Three Temporal Smoothing Algorithms to Improve Land Cover Classification: A Case Study from NEPAL. REMOTE SENSING 2020. [DOI: 10.3390/rs12182888] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/17/2022]
Abstract
Time series land cover data statistics often fluctuate abruptly due to seasonal impact and other noise in the input image. Temporal smoothing techniques are used to reduce the noise in time series data used in land cover mapping. The effects of smoothing may vary based on the smoothing method and land cover category. In this study, we compared the performance of Fourier transformation smoothing, the Whittaker smoother and the Linear-Fit averaging smoother on Landsat 5, 7 and 8 based yearly composites to classify land cover in Province No. 1 of Nepal. The performance of each smoother was tested based on whether it was applied on image composites or on land cover primitives generated using the random forest machine learning method. The land cover data used in the study were from the years 2000 to 2018. Probability distributions were examined to check the quality of the primitives, and the accuracy of the final land cover maps was assessed. The best results were found for the Whittaker smoothing for stable classes and Fourier smoothing for other classes. The results also show that classification using a properly selected smoothing algorithm outperforms a classification based on its unsmoothed data set. The final land cover generated by combining the best results obtained from different smoothing approaches increased our overall land cover map accuracy from 79.18% to 83.44%. This study shows that smoothing can result in a substantial increase in the quality of the results and that the smoothing approach should be carefully considered for each land cover class.
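To make the Whittaker smoother mentioned in this abstract concrete, the following is a minimal sketch of its common textbook form: it minimizes the squared misfit to the data plus `lam` times the squared second differences of the smoothed series. The second-order penalty, the default `lam`, and the function name are assumptions made for illustration; the abstract does not state the settings the authors used.

```python
def whittaker_smooth(y, lam=10.0):
    """Smooth an evenly spaced series by solving (I + lam * D^T D) z = y,
    where D is the second-difference operator (assumed penalty order)."""
    n = len(y)
    z = [float(v) for v in y]
    if n < 3:
        return z  # no second differences to penalize

    # Assemble the pentadiagonal system matrix A = I + lam * D^T D.
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for p in range(3):
            for q in range(3):
                a[r + p][r + q] += lam * stencil[p] * stencil[q]

    # Banded Gaussian elimination. A is symmetric positive definite,
    # so no pivoting is needed and the bandwidth of 2 is preserved.
    for k in range(n):
        for i in range(k + 1, min(k + 3, n)):
            factor = a[i][k] / a[k][k]
            for j in range(k, min(k + 3, n)):
                a[i][j] -= factor * a[k][j]
            z[i] -= factor * z[k]

    # Back substitution over the remaining upper band.
    for k in range(n - 1, -1, -1):
        s = z[k]
        for j in range(k + 1, min(k + 3, n)):
            s -= a[k][j] * z[j]
        z[k] = s / a[k][k]
    return z
```

Calling `whittaker_smooth(yearly_values, lam=10.0)` on a hypothetical per-pixel yearly series returns a series of the same length. Larger `lam` values give smoother output, and a perfectly linear series passes through unchanged because its second differences are zero.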
|
24
|
Zhang J, Yao Y, Suo N. Automatic classification of fine-scale mountain vegetation based on mountain altitudinal belt. PLoS One 2020; 15:e0238165. [PMID: 32841269 PMCID: PMC7447069 DOI: 10.1371/journal.pone.0238165] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/17/2019] [Accepted: 08/11/2020] [Indexed: 11/18/2022] Open
Abstract
Vegetation mapping is of considerable significance to both geoscience and mountain ecology, and the improved resolution of remote sensing images makes it possible to map vegetation at a finer scale. While the automatic classification of vegetation has gradually become a research hotspot, real-time and rapid collection of samples has become a bottleneck. How to achieve fine-scale classification and automatic sample selection at the same time needs further study. Stratified sampling based on appropriate prior knowledge is an effective sampling method for geospatial objects. Therefore, based on the idea of stratified sampling, this paper used the following three steps to realize the automatic selection of representative samples and classification of fine-scale mountain vegetation: 1) using Mountain Altitudinal Belt (MAB) distribution information to stratify the study area into multiple vegetation belts; 2) automatically selecting and correcting samples through iterative clustering within each belt; 3) using a Random Forest (RF) classifier, which is highly robust, to achieve automatic classification. The average sample accuracy of nine vegetation formations was 0.933, and the total accuracy of the classification result was 92.2%, with a kappa coefficient of 0.910. The results showed that this method could automatically select high-quality samples and obtain a high-accuracy vegetation map. Compared with the traditional vegetation mapping method, this method greatly improved efficiency, which is of great significance for fine-scale mountain vegetation mapping over large areas.
Affiliation(s)
- Junyao Zhang
  - State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yonghui Yao
  - State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  - * E-mail:
- Nandongzhu Suo
  - State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
|
25
|
Recursive Feature Elimination and Random Forest Classification of Natura 2000 Grasslands in Lowland River Valleys of Poland Based on Airborne Hyperspectral and LiDAR Data Fusion. REMOTE SENSING 2020. [DOI: 10.3390/rs12111842] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 01/15/2023]
Abstract
The use of hyperspectral (HS) and LiDAR acquisitions has great potential to enhance mapping and monitoring practices of endangered grassland habitats, beyond conventional botanical field surveys. In this study we assess the potential of recursive feature elimination (RFE) in combination with random forest (RF) classification in extracting the main HS and LiDAR features needed to map selected Natura 2000 grasslands along Polish lowland river valleys, in particular alluvial meadows 6440, lowland hay meadows 6510, and xeric and calcareous grasslands 6120. We developed an automated RFE-RF system capable of combining the potentials of both techniques and applied it to multiple acquisitions. Several LiDAR-based products and different spectral indices (SI) were computed and used as input in the system, with the aim of shedding light on the best-to-use features. Results showed a remarkable increase in classification accuracy when LiDAR and SI products are added to the HS dataset, strengthening in particular the importance of employing LiDAR in combination with HS. Using only the 24 optimal features selection generalized over the three study areas, strongly linked to the highly heterogeneous characteristics of the habitats and landscapes investigated, it was possible to achieve rather high classification results (K around 0.7–0.77 and habitats F1 accuracy around 0.8–0.85), indicating that the selected Natura 2000 meadows and dry grasslands habitats can be automatically mapped by airborne HS and LiDAR data. Similar approaches might be considered for future monitoring activities in the context of habitats protection and conservation.
|
26
|
The t-SNE Algorithm as a Tool to Improve the Quality of Reference Data Used in Accurate Mapping of Heterogeneous Non-Forest Vegetation. REMOTE SENSING 2019. [DOI: 10.3390/rs12010039] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022]
Abstract
Supervised classification methods, used for many applications including vegetation mapping, require accurate “ground truth” to be effective. Nevertheless, it is common for the quality of this data to be poorly verified prior to it being used for the training and validation of classification models. The fact that noisy or erroneous parts of the reference dataset are not removed is usually explained by the relatively high resistance of some algorithms to errors. The objective of this study was to demonstrate the rationale for cleaning the reference dataset used for the classification of heterogeneous non-forest vegetation, and to present a workflow based on the t-distributed stochastic neighbor embedding (t-SNE) algorithm for the better integration of reference data with remote sensing data in order to improve outcomes. The proposed analysis is a new application of the t-SNE algorithm. The effectiveness of this workflow was tested by classifying three heterogeneous non-forest Natura 2000 habitats: Molinia meadows (Molinion caeruleae; code 6410), species-rich Nardus grassland (code 6230) and dry heaths (code 4030), employing two commonly used algorithms: random forest (RF) and AdaBoost (AB), which, according to the literature, differ in their resistance to errors in reference datasets. Polygons collected in the field (on-ground reference data) in 2016 and 2017, containing no intentional errors, were used as the on-ground reference dataset. The remote sensing data used in the classification were obtained in 2017 during the peak growing season by a HySpex sensor consisting of two imaging spectrometers covering spectral ranges of 0.4–0.9 μm (VNIR-1800) and 0.9–2.5 μm (SWIR-384). The on-ground reference dataset was gradually cleaned by verifying candidate polygons selected by visual interpretation of t-SNE plots. Around 40–50% of candidate polygons were ultimately found to contain errors. Altogether, 15% of reference polygons were removed. As a result, the quality of the final map, as assessed by the Kappa and F1 accuracy measures as well as by visual evaluation, was significantly improved. The global map accuracy increased by about 6% (in Kappa coefficient), relative to the baseline classification obtained using random removal of the same number of reference polygons.
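The Kappa and F1 measures cited in this abstract can both be derived from a confusion matrix. The sketch below shows the standard definitions only; the function name and the row/column convention (rows are reference classes, columns are predicted classes) are assumptions for illustration, and no values from the study are reproduced.

```python
def kappa_and_f1(confusion):
    """Return (Cohen's kappa, per-class F1 list) from a square confusion
    matrix. Assumed layout: rows = reference class, columns = predicted."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    row_sums = [sum(confusion[i]) for i in range(k)]
    col_sums = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    # Observed agreement and agreement expected by chance.
    p_observed = sum(confusion[i][i] for i in range(k)) / total
    p_expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / total ** 2
    kappa = 1.0 if p_expected == 1.0 else (p_observed - p_expected) / (1.0 - p_expected)

    # F1 = 2*TP / (2*TP + FP + FN), and row_sum + col_sum = 2*TP + FN + FP.
    f1 = []
    for i in range(k):
        denom = row_sums[i] + col_sums[i]
        f1.append(2.0 * confusion[i][i] / denom if denom else 0.0)
    return kappa, f1
```

Kappa summarizes agreement over the whole map after discounting chance, while the per-class F1 values show which individual classes are well mapped, so the two are usually reported together.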
|
27
|
Evaluation and comparison of eight machine learning models in land use/land cover mapping using Landsat 8 OLI: a case study of the northern region of Iran. SN APPLIED SCIENCES 2019. [DOI: 10.1007/s42452-019-1527-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 11/26/2022] Open
|
28
|
A Metric for Evaluating the Geometric Quality of Land Cover Maps Generated with Contextual Features from High-Dimensional Satellite Image Time Series without Dense Reference Data. REMOTE SENSING 2019. [DOI: 10.3390/rs11161929] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
Land cover maps are a key resource for many studies in Earth Observation, and thanks to the high temporal, spatial, and spectral resolutions of systems like Sentinel-2, maps with a wide variety of land cover classes can now be automatically produced over vast areas. However, certain context-dependent classes, such as urban areas, remain challenging to classify correctly with pixel-based methods. Including contextual information into the classification can either be done at the feature level with texture descriptors or object-based approaches, or in the classification model itself, as is done in Convolutional Neural Networks. This improves recognition rates of these classes, but sometimes deteriorates the fine-resolution geometry of the output map, particularly in sharp corners and in fine elements such as rivers and roads. However, the quality of the geometry is difficult to assess in the absence of dense training data, which is usually the case in land cover mapping, especially over wide areas. This work presents a framework for measuring the geometric precision of a classification map, in order to provide deeper insight into the consequences of the use of various contextual features, when dense validation data is not available. This quantitative metric, named the Pixel Based Corner Match (PBCM), is based on corner detection and corner matching between a pixel-based classification result, and a contextual classification result. The selected case study is the classification of Sentinel-2 multi-spectral image time series, with a rich nomenclature containing context-dependent classes. To demonstrate the added value of the proposed metric, three spatial support shapes (window, object, superpixel) are compared according to their ability to improve the classification performance on this challenging problem, while paying attention to the geometric precision of the result. The results show that superpixels are the best candidate for the local statistics features, as they modestly improve the classification accuracy, while preserving the geometric elements in the image. Furthermore, the density of edges in a sliding window provides a significant boost in accuracy, and maintains a high geometric precision.
|
29
|
Using Airborne Hyperspectral Imaging Spectroscopy to Accurately Monitor Invasive and Expansive Herb Plants: Limitations and Requirements of the Method. SENSORS 2019; 19:s19132871. [PMID: 31261669 PMCID: PMC6651360 DOI: 10.3390/s19132871] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/27/2019] [Revised: 06/21/2019] [Accepted: 06/25/2019] [Indexed: 11/24/2022]
Abstract
Remote sensing (RS) is currently regarded as one of the standard tools used for mapping invasive and expansive plants for scientific purposes and it is increasingly widely used in nature conservation management. The applicability of RS methods is determined by their limitations and requirements. One of the most important limitations is the species percentage cover at which the classification result is correct and useful for nature conservation. The primary objective of this research, carried out in 2017 in three areas of Poland, was to determine the minimum percentage cover from which it is possible to identify a target species by RS methods. A secondary objective, related to the requirements of the method, was to optimize the set of training polygons for a target species in terms of the number of polygons and the percentage cover of the target species. Our method has to be easy to use, effective, and applicable; therefore, the analysis was carried out using a basic set of rasters, namely the first 30 channels after the Minimum Noise Fraction (MNF) transformation (a mosaic of hyperspectral data from HySpex sensors with a spectral range of 0.4–2.5 µm), and the commonly used Random Forest algorithm. The analysis used airborne hyperspectral data with a spatial resolution of 1 m to perform classification of one invasive and three expansive plants (two grasses and two large perennials). On-ground training and validation data sets were collected simultaneously with airborne data collection. When testing different classification scenarios, only the set of training polygons for a target species was changed. Classification results were evaluated based on three methods: accuracy measures (Kappa and F1), true-positive pixels in subclasses with different species cover and compatibility with field mapping. The classification results indicate that to classify the target plant species at the accepted level, the training dataset should contain polygons with a species cover ranging from 80–100%. Training performed only using polygons in which the species had a variable but lower cover (20–70%), with no samples in the 80–100% range, led to a map which was not acceptable because of a high overestimation of the target species. We achieved effective identification of species in areas where the species cover is above 50%, considering that ecosystems are heterogeneous. These results informed a methodology for field data acquisition and underline the need to synchronize airborne data acquisition with on-ground training and validation sampling.
|
30
|
Label Noise Cleansing with Sparse Graph for Hyperspectral Image Classification. REMOTE SENSING 2019. [DOI: 10.3390/rs11091116] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
In a real hyperspectral image classification task, label noise inevitably exists in training samples. To deal with label noise, current methods assume that noise obeys the Gaussian distribution, which is not the real case in practice, because in most cases, we are more likely to misclassify training samples at the boundaries between different classes. In this paper, we propose a spectral–spatial sparse graph-based adaptive label propagation (SALP) algorithm to address a more practical case, where the label information is contaminated by random noise and boundary noise. Specifically, the SALP mainly includes two steps: First, a spectral–spatial sparse graph is constructed to depict the contextual correlations between pixels within the same superpixel homogeneous region, which are generated by superpixel image segmentation, and then a transfer matrix is produced to describe the transition probability between pixels. Second, after randomly splitting training pixels into “clean” and “polluted,” we iteratively propagate the label information from “clean” to “polluted” based on the transfer matrix, and the relabeling strategy for each pixel is adaptively adjusted along with its spatial position in the corresponding homogeneous region. Experimental results on two standard hyperspectral image datasets show that the proposed SALP over four major classifiers can significantly decrease the influence of noisy labels, and our method achieves better performance compared with the baselines.
|
31
|
Assessment of Optimal Transport for Operational Land-Cover Mapping Using High-Resolution Satellite Images Time Series without Reference Data of the Mapping Period. REMOTE SENSING 2019. [DOI: 10.3390/rs11091047] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Land-cover map production using remote-sensing imagery is governed by data availability. In our case, data sources are two-fold: on one hand, optical data provided regularly by satellites such as Sentinel-2, and on the other hand, reference data which allow calibrating mapping methods or validating the results. The lengthy delays due to reference data collection and cleansing are one of the main issues for applications. In this work, the use of Optimal Transport (OT) is proposed. OT is a Domain Adaptation method that uses past data, both images and reference data, to produce the land-cover map of the current period without updated reference data. Seven years of Formosat-2 image time series and the corresponding reference data are used to evaluate two OT algorithms: conventional EMD transport and regularized transport based on the Sinkhorn distance. The contribution of OT to a classification fusion strategy is also evaluated. The results show that with a 17-class nomenclature the problem is too complex for the Sinkhorn algorithm, which provides maps with an Overall Accuracy (OA) of 30%. In contrast, with the EMD algorithm, an OA close to 70% is obtained. One limitation of OT is the number of classes that can be considered at the same time. Simplification schemes are proposed to reduce the number of classes to be transported. Cases of improvement are shown when the problem is simplified, with an improvement in OA varying between 5% and 20%, producing maps with an OA near 79%. As several years are available, the OT approaches are compared to standard fusion schemes, such as majority voting. The gain in voting strategies with OT use is lower than the gain obtained with standard majority voting (around 5%).
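As a rough illustration of the regularized transport mentioned in the abstract, the sketch below runs the standard Sinkhorn scaling iterations on two small histograms. It is not the authors' implementation; the function name, the regularization value, and the toy cost matrix are assumptions made for this example.

```python
import math


def sinkhorn(a, b, cost, reg=0.1, n_iter=500):
    """Entropy-regularized transport plan between histograms a and b."""
    # Gibbs kernel: a cheap move gets a weight near 1, an expensive move a weight near 0.
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * len(a)
    v = [1.0] * len(b)
    for _ in range(n_iter):
        # Rescale rows to match the source marginal a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(len(b))) for i in range(len(a))]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(len(a))) for j in range(len(b))]
    return [[u[i] * K[i][j] * v[j] for j in range(len(b))] for i in range(len(a))]


# Toy problem: two source bins and two target bins, where moving mass across bins costs 1.
plan = sinkhorn([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
```

A smaller `reg` brings the plan closer to the unregularized (EMD) solution but can make the kernel underflow, which is one practical reason the two algorithms behave differently.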
|
32
|
Evaluation of Sentinel-1 and 2 Time Series for Land Cover Classification of Forest–Agriculture Mosaics in Temperate and Tropical Landscapes. REMOTE SENSING 2019. [DOI: 10.3390/rs11080979] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Monitoring forest–agriculture mosaics is crucial for understanding landscape heterogeneity and managing biodiversity. Mapping these mosaics from remotely sensed imagery remains challenging, since ecological gradients from forested to agricultural areas make characterizing vegetation more difficult. The recent synthetic aperture radar (SAR) Sentinel-1 (S-1) and optical Sentinel-2 (S-2) time series provide a great opportunity to monitor forest–agriculture mosaics due to their high spatial and temporal resolutions. However, while a few studies have used the temporal resolution of S-2 time series alone to map land cover and land use in cropland and/or forested areas, S-1 time series have not yet been investigated alone for this purpose. The combined use of S-1 and S-2 time series has been assessed for only one or a few land cover classes. In this study, we assessed the potential of S-1 data alone, S-2 data alone, and their combined use for mapping forest–agriculture mosaics over two study areas: a temperate mountainous landscape in the Cantabrian Range (Spain) and a tropical forested landscape in Paragominas (Brazil). Satellite images were classified using an incremental procedure based on an importance rank of the input features. The classifications obtained with S-2 data alone (mean kappa index = 0.59–0.83) were more accurate than those obtained with S-1 data alone (mean kappa index = 0.28–0.72). Accuracy increased when combining S-1 and S-2 data (mean kappa index = 0.55–0.85). The method enables defining the number and type of features that discriminate land cover classes in an optimal manner according to the type of landscape considered. The best configuration for the Spanish and Brazilian study areas included 5 and 10 features, respectively, for S-2 data alone and 10 and 20 features, respectively, for S-1 data alone. Short-wave infrared and VV and VH polarizations were key features of S-2 and S-1 data, respectively.
In addition, the method enables defining key periods that discriminate land cover classes according to the type of images used. For example, in the Cantabrian Range, winter and summer were key for S-2 time series, while spring and winter were key for S-1 time series.
|
33
|
Multiple Flights or Single Flight Instrument Fusion of Hyperspectral and ALS Data? A Comparison of their Performance for Vegetation Mapping. REMOTE SENSING 2019. [DOI: 10.3390/rs11080970] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Fusion of remote sensing data often improves vegetation mapping, compared to using data from only a single source. The effectiveness of this fusion is subject to many factors, including the type of data, collection method, and purpose of the analysis. In this study, we compare the usefulness of hyperspectral (HS) and Airborne Laser System (ALS) data fusion acquired in separate flights, Multiple Flights Data Fusion (MFDF), and during a single flight through Instrument Fusion (IF) for the classification of non-forest vegetation. An area of 6.75 km² was selected, where hyperspectral and ALS data were collected during two flights in 2015 and one flight in 2017. These data were used to classify three non-forest Natura 2000 habitats, i.e., xeric sand calcareous grasslands (code 6120), alluvial meadows of river valleys of the Cnidion dubii (code 6440), and species-rich Nardus grasslands (code 6230), using a Random Forest classifier. Our findings show that it is not possible to determine which sensor, HS or ALS, used independently leads to a higher classification accuracy for the investigated Natura 2000 habitats. At the same time, increased stability and consistency of classification results were confirmed, regardless of the type of fusion used (IF or MFDF) and of the varied information relevance of single-sensor data. The research shows that the manner of data collection, using MFDF or IF, does not determine the level of relevance of ALS or HS data. The analysis of fusion effectiveness, gauged as the accuracy of the classification result and the time consumed for data collection, has shown a superiority of IF over MFDF. IF delivered more accurate classification results than MFDF. IF is always cheaper than MFDF, and the difference in effectiveness of the two methods becomes more pronounced as the area of aerial data collection becomes larger.
|
34
|
Gong P, Liu H, Zhang M, Li C, Wang J, Huang H, Clinton N, Ji L, Li W, Bai Y, Chen B, Xu B, Zhu Z, Yuan C, Ping Suen H, Guo J, Xu N, Li W, Zhao Y, Yang J, Yu C, Wang X, Fu H, Yu L, Dronova I, Hui F, Cheng X, Shi X, Xiao F, Liu Q, Song L. Stable classification with limited sample: transferring a 30-m resolution sample set collected in 2015 to mapping 10-m resolution global land cover in 2017. Sci Bull (Beijing) 2019; 64:370-373. [PMID: 36659725 DOI: 10.1016/j.scib.2019.03.002] [Citation(s) in RCA: 238] [Impact Index Per Article: 47.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/25/2018] [Revised: 02/26/2019] [Accepted: 02/27/2019] [Indexed: 01/21/2023]
Affiliation(s)
- Peng Gong
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China; AI for Earth Lab, Cross-Strait Institute, Tsinghua University, Beijing 100084, China.
| | - Han Liu
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Meinan Zhang
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Congcong Li
- Department of Environmental Science, Policy and Management, University of California, Berkeley, CA 94720, USA
| | - Jie Wang
- AI for Earth Lab, Cross-Strait Institute, Tsinghua University, Beijing 100084, China; State Key Laboratory of Remote Sensing Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100101, China.
| | - Huabing Huang
- State Key Laboratory of Remote Sensing Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100101, China.
| | | | - Luyan Ji
- Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China
| | - Wenyu Li
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Yuqi Bai
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Bin Chen
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Bing Xu
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Zhiliang Zhu
- United States Geological Survey, Reston, VA 20192, USA
| | - Cui Yuan
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Hoi Ping Suen
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Jing Guo
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Nan Xu
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Weijia Li
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Yuanyuan Zhao
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Jun Yang
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Chaoqing Yu
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China; AI for Earth Lab, Cross-Strait Institute, Tsinghua University, Beijing 100084, China
| | - Xi Wang
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China; AI for Earth Lab, Cross-Strait Institute, Tsinghua University, Beijing 100084, China
| | - Haohuan Fu
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China; National Supercomputing Center in Wuxi, Wuxi 214072, China
| | - Le Yu
- Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
| | - Iryna Dronova
- Department of Landscape Architecture, University of California, Berkeley, CA 94720, USA
| | - Fengming Hui
- State Key Laboratory of Remote Sensing Science, College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
| | - Xiao Cheng
- State Key Laboratory of Remote Sensing Science, College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
| | - Xueli Shi
- National Climate Center, China Meteorological Administration, Beijing 100081, China
| | - Fengjin Xiao
- National Climate Center, China Meteorological Administration, Beijing 100081, China
| | - Qiufeng Liu
- National Climate Center, China Meteorological Administration, Beijing 100081, China
| | - Lianchun Song
- National Climate Center, China Meteorological Administration, Beijing 100081, China
|
35
|
Temporal Convolutional Neural Network for the Classification of Satellite Image Time Series. REMOTE SENSING 2019. [DOI: 10.3390/rs11050523] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The latest remote sensing sensors are capable of acquiring high spatial and spectral Satellite Image Time Series (SITS) of the world. These image series are a key component of classification systems that aim at obtaining up-to-date and accurate land cover maps of the Earth’s surfaces. More specifically, current SITS combine high temporal, spectral and spatial resolutions, which makes it possible to closely monitor vegetation dynamics. Although traditional classification algorithms, such as Random Forest (RF), have been successfully applied to create land cover maps from SITS, these algorithms do not make the most of the temporal domain. This paper proposes a comprehensive study of Temporal Convolutional Neural Networks (TempCNNs), a deep learning approach which applies convolutions in the temporal dimension in order to automatically learn temporal (and spectral) features. The goal of this paper is to quantitatively and qualitatively evaluate the contribution of TempCNNs for SITS classification, as compared to RF and Recurrent Neural Networks (RNNs), a standard deep learning approach that is particularly suited to temporal data. We carry out experiments on a Formosat-2 scene with 46 images and one million labelled time series. The experimental results show that TempCNNs are more accurate than the current state of the art for SITS classification. We provide some general guidelines on the network architecture, common regularization mechanisms, and hyper-parameter values such as batch size; we also draw out some differences with standard results in computer vision (e.g., about pooling layers). Finally, we assess the visual quality of the land cover maps produced by TempCNNs.
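The core operation described above, a convolution applied along the temporal dimension, can be sketched for a single pixel and a single band as follows. This is only an illustration of the operation and not the TempCNN architecture; the function name, the hand-picked kernel, and the NDVI-like values are hypothetical (a real network learns its kernels).

```python
def temporal_conv(series, kernel):
    """Valid 1D cross-correlation of a single-band time series with a kernel, followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(series[t + i] * kernel[i] for i in range(k)))
        for t in range(len(series) - k + 1)
    ]


# A difference kernel responds to green-up (rising values) and is silenced by ReLU during senescence.
ndvi_like = [0.2, 0.3, 0.8, 0.7, 0.3]
features = temporal_conv(ndvi_like, [-1.0, 0.0, 1.0])
```

Because the kernel slides over acquisition dates, the output depends on the order of the observations, which is the temporal information that a classifier fed an unordered feature vector does not exploit.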
|
36
|
In-Season Mapping of Irrigated Crops Using Landsat 8 and Sentinel-1 Time Series. REMOTE SENSING 2019. [DOI: 10.3390/rs11020118] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Numerous studies have reported the use of multi-spectral and multi-temporal remote sensing images to map irrigated crops. Such maps are useful for water management. The recent availability of optical and radar image time series such as the Sentinel data offers new opportunities to map land cover with high spatial and temporal resolutions. Early identification of irrigated crops is of major importance for irrigation scheduling, but cloud coverage might significantly reduce the number of available optical images, making crop identification difficult. SAR image time series such as those provided by Sentinel-1 offer the possibility of improving early crop mapping. This paper studies the impact of the Sentinel-1 images when used jointly with optical imagery (Landsat 8) and a digital elevation model (DEM) of the Shuttle Radar Topography Mission (SRTM). The study site is located in a temperate zone (southwest France) with irrigated maize crops. The classifier used is the Random Forest. The combined use of the different data (radar, optical, and SRTM) improves the early classifications of the irrigated crops (k = 0.89) compared to classifications obtained using each type of data separately (k = 0.84). The use of the DEM is significant for the early stages but becomes useless once crops have reached their full development. In conclusion, compared to a “full optical” approach, the “combined” method is more robust over time as radar images permit cloudy conditions to be overcome.
|
37
|
Fusing High-Spatial-Resolution Remotely Sensed Imagery and OpenStreetMap Data for Land Cover Classification Over Urban Areas. REMOTE SENSING 2019. [DOI: 10.3390/rs11010088] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Land cover classification of urban areas is critical for understanding the urban environment. High-resolution remotely sensed imagery provides abundant, detailed spatial information for urban classification. In the meantime, OpenStreetMap (OSM) data, as typical crowd-sourced geographical information, have been an emerging data source for obtaining urban information. In this context, a land cover classification method that fuses high-resolution remotely sensed imagery and OSM data is proposed. Training samples were generated by integrating the OSM data and multiple information indexes. OSM data, which contain class attributes and location information of urban objects, served as the labels of initial training samples. Multiple information indexes that reflect spectral and spatial characteristics of different classes were utilized to improve the training set. Morphological attribute profiles were used because the structural and contextual information of images was effective in distinguishing the classes with similar spectral characteristics. Moreover, a road superimposition strategy that considers road hierarchy was developed because OSM data provide road information with high completeness in the urban area. Experiments were conducted on the data captured over Wuhan city, and three state-of-the-art approaches were adopted for comparison. Results show that the proposed approach obtains satisfactory results and outperforms the other comparative approaches.
|
38
|
|
39
|
Detailed Land Cover Mapping from Multitemporal Landsat-8 Data of Different Cloud Cover. REMOTE SENSING 2018. [DOI: 10.3390/rs10081214] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Detailed, accurate and frequent land cover mapping is a prerequisite for several important geospatial applications and the fulfilment of current sustainable development goals. This paper introduces a methodology for the classification of annual high-resolution satellite data into several detailed land cover classes. In particular, a nomenclature with 27 different classes was introduced based on CORINE Land Cover (CLC) Level-3 categories and further analysing various crop types. Without employing cloud masks and/or interpolation procedures, we formed experimental datasets of Landsat-8 (L8) images with gradually increased cloud cover in order to assess the influence of cloud presence on the reference data and the resulting classification accuracy. The performance of shallow kernel-based and deep patch-based machine learning classification frameworks was evaluated. Quantitatively, the resulting overall accuracy rates differed within a range of less than 3%; however, maps produced based on Support Vector Machines (SVM) were more accurate across class boundaries and the respective framework was less computationally expensive compared to the applied patch-based deep Convolutional Neural Network (CNN). Further experimental results and analysis indicated that employing all multitemporal images with up to 30% cloud cover delivered relatively higher overall accuracy rates as well as the highest per-class accuracy rates. Moreover, by selecting 70% of the top-ranked features after applying a feature selection strategy, slightly higher accuracy rates were achieved. A detailed discussion of the quantitative and qualitative evaluation outcomes further elaborates on the performance of all considered classes and highlights different aspects of their spectral behaviour and separability.
|
40
|
Scalable Parcel-Based Crop Identification Scheme Using Sentinel-2 Data Time-Series for the Monitoring of the Common Agricultural Policy. REMOTE SENSING 2018. [DOI: 10.3390/rs10060911] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
41
|
Fusion Approaches for Land Cover Map Production Using High Resolution Image Time Series without Reference Data of the Corresponding Period. REMOTE SENSING 2017. [DOI: 10.3390/rs9111151] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
42
|
Self-Learning Based Land-Cover Classification Using Sequential Class Patterns from Past Land-Cover Maps. REMOTE SENSING 2017. [DOI: 10.3390/rs9090921] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
43
|
Waldner F, Hansen MC, Potapov PV, Löw F, Newby T, Ferreira S, Defourny P. National-scale cropland mapping based on spectral-temporal features and outdated land cover information. PLoS One 2017; 12:e0181911. [PMID: 28817618 PMCID: PMC5560701 DOI: 10.1371/journal.pone.0181911] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2016] [Accepted: 07/10/2017] [Indexed: 11/19/2022] Open
Abstract
The lack of sufficient ground truth data has always constrained supervised learning, thereby hindering the generation of up-to-date satellite-derived thematic maps. This is all the more true for those applications requiring frequent updates over large areas such as cropland mapping. Therefore, we present a method enabling the automated production of spatially consistent cropland maps at the national scale, based on spectral-temporal features and outdated land cover information. Following an unsupervised approach, this method extracts reliable calibration pixels based on their labels in the outdated map and their spectral signatures. To ensure spatial consistency and coherence in the map, we first propose to generate seamless input images by normalizing the time series and deriving spectral-temporal features that target salient cropland characteristics. Second, we reduce the spatial variability of the class signatures by stratifying the country and by classifying each stratum independently. Finally, we remove speckle with a weighted majority filter accounting for per-pixel classification confidence. Capitalizing on a wall-to-wall validation data set, the method was tested in South Africa using a 16-year-old land cover map and multi-sensor Landsat time series. The overall accuracy of the resulting cropland map reached 92%. A spatially explicit validation revealed large variations across the country and suggests that intensive grain-growing areas were better characterized than smallholder farming systems. Informative features in the classification process vary from one stratum to another, but features targeting the minimum of vegetation as well as short-wave infrared features were consistently important throughout the country. Overall, the approach showed potential for routinely delivering consistent cropland maps over large areas as required for operational crop monitoring.
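The final step in the abstract, a weighted majority filter that accounts for per-pixel classification confidence, could look roughly like the sketch below. The window size, the function name, and the toy arrays are assumptions made for illustration; the paper's actual filter may differ.

```python
from collections import defaultdict


def weighted_majority_filter(labels, confidence, radius=1):
    """Relabel each pixel with the class holding the largest summed confidence in its window."""
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            votes = defaultdict(float)
            # Accumulate confidence-weighted votes over the window, clipped at the image edge.
            for i in range(max(0, r - radius), min(rows, r + radius + 1)):
                for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                    votes[labels[i][j]] += confidence[i][j]
            out[r][c] = max(votes, key=votes.get)
    return out


# Toy map: one low-confidence non-cropland pixel (0) surrounded by confident cropland pixels (1).
labels = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
confidence = [[0.9, 0.9, 0.9], [0.9, 0.3, 0.9], [0.9, 0.9, 0.9]]
smoothed = weighted_majority_filter(labels, confidence)
```

Weighting by confidence means that an isolated, uncertain pixel is overruled by its neighbours, whereas a small but confidently classified patch has a better chance of surviving than under a plain majority filter.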
Affiliation(s)
- François Waldner
- Université catholique de Louvain, Earth and Life Institute-Environmental Sciences, 2 Croix du Sud, 1348 Louvain-la-Neuve, Belgium
| | - Matthew C. Hansen
- Department of Geographical Sciences, University of Maryland, 4321 Hartwick Road, College Park, Maryland, United States of America
| | - Peter V. Potapov
- Department of Geographical Sciences, University of Maryland, 4321 Hartwick Road, College Park, Maryland, United States of America
| | - Fabian Löw
- MapTailor Geospatial Consulting GbR, 53113 Bonn, Germany
| | - Terence Newby
- Agricultural Research Council, Private Bag X79, 0001 Pretoria, South Africa
| | | | - Pierre Defourny
- Université catholique de Louvain, Earth and Life Institute-Environmental Sciences, 2 Croix du Sud, 1348 Louvain-la-Neuve, Belgium
|
44
|
High-Resolution Vegetation Mapping in Japan by Combining Sentinel-2 and Landsat 8 Based Multi-Temporal Datasets through Machine Learning and Cross-Validation Approach. LAND 2017. [DOI: 10.3390/land6030050] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
|