1
Hao M, Dong X, Jiang D, Yu X, Ding F, Zhuo J. Land-use classification based on high-resolution remote sensing imagery and deep learning models. PLoS One 2024; 19:e0300473. [PMID: 38635663 PMCID: PMC11025814 DOI: 10.1371/journal.pone.0300473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Accepted: 02/03/2024] [Indexed: 04/20/2024]
Abstract
High-resolution imagery and deep learning models have gained increasing importance in land-use mapping. In recent years, several new deep learning network modeling methods have surfaced. However, there has been a lack of a clear understanding of the performance of these models. In this study, we applied four well-established and robust deep learning models (FCN-8s, SegNet, U-Net, and Swin-UNet) to an open benchmark high-resolution remote sensing dataset to compare their performance in land-use mapping. The results indicate that FCN-8s, SegNet, U-Net, and Swin-UNet achieved overall accuracies of 80.73%, 89.86%, 91.90%, and 96.01%, respectively, on the test set. Furthermore, we assessed the generalization ability of these models using two measures: intersection of union and F1 score, which highlight Swin-UNet's superior robustness compared to the other three models. In summary, our study provides a systematic analysis of the classification differences among these four deep learning models through experiments. It serves as a valuable reference for selecting models in future research, particularly in scenarios such as land-use mapping, urban functional area recognition, and natural resource management.
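The three measures named in this abstract (overall accuracy, intersection over union, and F1 score) can all be derived from a per-class confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name `segmentation_metrics` and the two-class example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def segmentation_metrics(conf):
    """Return (overall accuracy, per-class IoU, per-class F1).

    conf[i, j] is assumed to hold the number of pixels whose true class
    is i and whose predicted class is j (hypothetical convention).
    Classes that never occur and are never predicted would divide by zero.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)               # correctly labelled pixels per class
    fp = conf.sum(axis=0) - tp       # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp       # belong to the class, but missed
    overall_accuracy = tp.sum() / conf.sum()
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return overall_accuracy, iou, f1

# Made-up two-class confusion matrix, for illustration only.
example_conf = np.array([[50, 5],
                         [10, 35]])
oa, iou, f1 = segmentation_metrics(example_conf)
```

Per class, the two overlap measures are tied by F1 = 2·IoU / (1 + IoU), so they rank models identically for a single class but can differ once averaged over classes.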
Affiliation(s)
- Mengmeng Hao
  - Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, China
- Xiaohan Dong
  - Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, China
- Dong Jiang
  - Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, China
- Fangyu Ding
  - Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, China
- Jun Zhuo
  - Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, China
2
Li X, Li Z, Qiu H, Chen G, Fan P. Soil carbon content prediction using multi-source data feature fusion of deep learning based on spectral and hyperspectral images. CHEMOSPHERE 2023; 336:139161. [PMID: 37302502 DOI: 10.1016/j.chemosphere.2023.139161] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/18/2023] [Revised: 06/05/2023] [Accepted: 06/06/2023] [Indexed: 06/13/2023]
Abstract
Visible near-infrared reflectance spectroscopy (VNIR) and hyperspectral images (HSI) have their respective advantages in soil carbon content prediction, and the effective fusion of VNIR and HSI is of great significance for improving the prediction accuracy. But the contribution difference analysis of multiple features in the multi-source data is inadequate, and there is a lack of in-depth research on the contribution difference analysis of artificial feature and deep learning feature. In order to solve the problem, soil carbon content prediction methods based on VNIR and HSI multi-source data feature fusion are proposed. The multi-source data fusion network under the attention mechanism and the multi-source data fusion network with artificial feature are designed. For the multi-source data fusion network based on the attention mechanism, the information are fused through the attention mechanism according to the contribution difference of each feature. For the other network, artificial feature are introduced to fuse multi-source data. The results show that multi-source data fusion network based on the attention mechanism can improve the prediction accuracy of soil carbon content, and multi-source data fusion network combined with artificial feature has better prediction effect. Compared with two single-source data from the VNIR and HSI, the relative percent deviation of Neilu, Aoshan Bay and Jiaozhou Bay based on multi-source data fusion network combined with artificial feature are increased by 56.81% and 149.18%, 24.28% and 43.96%, 31.16% and 28.73% respectively. This study can effectively solve the problem of the deep fusion of multiple features in the soil carbon content prediction by VNIR and HSI, so as to improve the accuracy and stability of soil carbon content prediction, promote the application and development of soil carbon content prediction in spectral and hyperspectral image, and provide technical support for the study of carbon cycle and the carbon sink.
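The abstract reports gains in relative percent deviation (RPD) without defining it. In soil spectroscopy, RPD is commonly computed as the standard deviation of the reference values divided by the root-mean-square error of prediction. The sketch below assumes that definition, and assumes the reported gains are relative increases; neither is confirmed by the abstract. The function names and sample values are hypothetical.

```python
import numpy as np

def rpd(y_true, y_pred):
    """RPD under the assumed definition: SD of reference values / RMSE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    # Sample standard deviation (ddof=1) is an assumption; some studies use ddof=0.
    return np.std(y_true, ddof=1) / rmse

def relative_increase(new, old):
    """Percentage increase of `new` over `old` (assumed reading of 'increased by')."""
    return 100.0 * (new - old) / old

# Made-up measurements, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
score = rpd(measured, predicted)
```

Under this definition a higher RPD means the prediction error is small relative to the natural spread of the samples, which is why an increase is reported as an improvement.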
Affiliation(s)
- Xueying Li
  - Institute of Oceanographic Instrumentation, Qilu University of Technology (Shandong Academy of Sciences), Qingdao, 266061, China
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266590, China
- Zongmin Li
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266590, China
- Huimin Qiu
  - Institute of Oceanographic Instrumentation, Qilu University of Technology (Shandong Academy of Sciences), Qingdao, 266061, China
- Guangyuan Chen
  - College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
- Pingping Fan
  - Institute of Oceanographic Instrumentation, Qilu University of Technology (Shandong Academy of Sciences), Qingdao, 266061, China
3
DU-Net-Cloud: a smart cloud-edge application with an attention mechanism and U-Net for remote sensing images and processing. JOURNAL OF CLOUD COMPUTING 2023. [DOI: 10.1186/s13677-023-00403-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
In recent ages, the use of deep learning approaches to extract ground object information from remote sensing high-resolution images has attracted extensive attention in many fields. Nevertheless, due to the high similarity of features between roads, prevailing deep learning semantic segmentation networks commonly demonstrate reduced continuity in road segmentation. Besides this, the role of advanced computing technologies including cloud and edge infrastructures has also become very important due to large number of images and their storage requirements. In order to better study the road details in images related to remote sensing, this paper suggests a road extraction technique which is basically founded on Dimensional U-Net (DU-Net) network. At the deepening level of the U-Net network, a parallel attention mechanism, known as ProCBAM, is added and implemented to the feature transmission step of the classical U-Net network. Moreover, we use and implement the edge-cloud architecture to develop and construct a unique remote sensing image service system that integrates several datacenters and their related edge infrastructure. In the proposed system, the edge network is primarily used for caching and distributing the processed remote sensing images, while the remote datacenter serves as the cloud platform and is responsible for the storage and processing of original remote sensing images. The results show that the proposed cloud enabled DU-Net model has achieved good performance in road segmentation. We observed that it can achieve improved road segmentation and resolve the issue of reduced continuity of road segmentation when compared with other state-of-the-art learning networks. Moreover, our empirical evaluations suggest that the proposed system not only distributes the workload of processing tasks across the edges but also achieves data efficiency among them, which enhances image processing efficiency and reduces data transmission costs.
4
Transferable Deep Learning from Time Series of Landsat Data for National Land-Cover Mapping with Noisy Labels: A Case Study of China. REMOTE SENSING 2021. [DOI: 10.3390/rs13214194] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/21/2022]
Abstract
Large-scale land-cover classification using a supervised algorithm is a challenging task. Enormous efforts have been made to manually process and check the production of national land-cover maps. This has led to complex pre- and post-processing and even the production of inaccurate mapping products from large-scale remote sensing images. Inspired by the recent success of deep learning techniques, in this study we provided a feasible automatic solution for improving the quality of national land-cover maps. However, the application of deep learning to national land-cover mapping remains limited because only small-scale noisy labels are available. To this end, a mutual transfer network MTNet was developed. MTNet is capable of learning better feature representations by mutually transferring pre-trained models from time-series of data and fine-tuning current data. An interactive training strategy such as this can effectively alleviate the effects of inaccurate or noisy labels and unbalanced sample distributions, thus yielding a relatively stable classification system. Extensive experiments were conducted by focusing on several representative regions to evaluate the classification results of our proposed method. Quantitative results showed that the proposed MTNet outperformed its baseline model about 1%, and the accuracy can be improved up to 6.45% compared with the model trained by the training set of another year. We also visualized the national classification maps generated by MTNet for two different time periods to quantitatively analyze the performance gain. It was concluded that the proposed MTNet provides an efficient method for large-scale land cover mapping.
5
Effect of Attention Mechanism in Deep Learning-Based Remote Sensing Image Processing: A Systematic Literature Review. REMOTE SENSING 2021. [DOI: 10.3390/rs13152965] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 12/12/2022]
Abstract
Machine learning, particularly deep learning (DL), has become a central and state-of-the-art method for several computer vision applications and remote sensing (RS) image processing. Researchers are continually trying to improve the performance of the DL methods by developing new architectural designs of the networks and/or developing new techniques, such as attention mechanisms. Since the attention mechanism has been proposed, regardless of its type, it has been increasingly used for diverse RS applications to improve the performances of the existing DL methods. However, these methods are scattered over different studies impeding the selection and application of the feasible approaches. This study provides an overview of the developed attention mechanisms and how to integrate them with different deep learning neural network architectures. In addition, it aims to investigate the effect of the attention mechanism on deep learning-based RS image processing. We identified and analyzed the advances in the corresponding attention mechanism-based deep learning (At-DL) methods. A systematic literature review was performed to identify the trends in publications, publishers, improved DL methods, data types used, attention types used, overall accuracies achieved using At-DL methods, and extracted the current research directions, weaknesses, and open problems to provide insights and recommendations for future studies. For this, five main research questions were formulated to extract the required data and information from the literature. Furthermore, we categorized the papers regarding the addressed RS image processing tasks (e.g., image classification, object detection, and change detection) and discussed the results within each group. In total, 270 papers were retrieved, of which 176 papers were selected according to the defined exclusion criteria for further analysis and detailed review. 
The results reveal that most of the papers reported an increase in overall accuracy when using the attention mechanism within the DL methods for image classification, image segmentation, change detection, and object detection using remote sensing images.
6
Transferability of Convolutional Neural Network Models for Identifying Damaged Buildings Due to Earthquake. REMOTE SENSING 2021. [DOI: 10.3390/rs13030504] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 02/07/2023]
Abstract
The collapse of buildings caused by earthquakes can lead to a large loss of life and property. Rapid assessment of building damage with remote sensing image data can support emergency rescues. However, current studies indicate that only a limited sample set can usually be obtained from remote sensing images immediately following an earthquake. Consequently, the difficulty in preparing sufficient training samples constrains the generalization of the model in the identification of earthquake-damaged buildings. To produce a deep learning network model with strong generalization, this study adjusted four Convolutional Neural Network (CNN) models for extracting damaged building information and compared their performance. A sample dataset of damaged buildings was constructed by using multiple disaster images retrieved from the xBD dataset. Using satellite and aerial remote sensing data obtained after the 2008 Wenchuan earthquake, we examined the geographic and data transferability of the deep network model pre-trained on the xBD dataset. The result shows that the network model pre-trained with samples generated from multiple disaster remote sensing images can extract accurately collapsed building information from satellite remote sensing data. Among the adjusted CNN models tested in the study, the adjusted DenseNet121 was the most robust. Transfer learning solved the problem of poor adaptability of the network model to remote sensing images acquired by different platforms and could identify disaster-damaged buildings properly. These results provide a solution to the rapid extraction of earthquake-damaged building information based on a deep learning network model.
7
Chen C, Huang K, Li D, Zhao Z, Hong J. Multi-Segmentation Parallel CNN Model for Estimating Assembly Torque Using Surface Electromyography Signals. SENSORS 2020; 20:s20154213. [PMID: 32751213 PMCID: PMC7435780 DOI: 10.3390/s20154213] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/13/2020] [Revised: 07/14/2020] [Accepted: 07/20/2020] [Indexed: 11/16/2022]
Abstract
The precise application of tightening torque is one of the important measures to ensure accurate bolt connection and improvement in product assembly quality. Currently, due to the limited assembly space and efficiency, a wrench without the function of torque measurement is still an extensively used assembly tool. Therefore, wrench torque monitoring is one of the urgent problems that needs to be solved. This study proposes a multi-segmentation parallel convolution neural network (MSP-CNN) model for estimating assembly torque using surface electromyography (sEMG) signals, which is a method of torque monitoring through classification methods. The MSP-CNN model contains two independent CNN models with different or offset torque granularities, and their outputs are fused to obtain a finer classification granularity, thus improving the accuracy of torque estimation. First, a bolt tightening test bench is established to collect sEMG signals and tightening torque signals generated when the operator tightens various bolts using a wrench. Second, the sEMG and torque signals are preprocessed to generate the sEMG signal graphs. The range of the torque transducer is divided into several equal subdivision ranges according to different or offset granularities, and each subdivision range is used as a torque label for each torque signal. Then, the training set, verification set, and test set are established for torque monitoring to train the MSP-CNN model. The effects of different signal preprocessing methods, torque subdivision granularities, and pooling methods on the recognition accuracy and torque monitoring accuracy of a single CNN network are compared experimentally. The results show that compared to maximum pooling, average pooling can improve the accuracy of CNN torque classification and recognition. 
Moreover, the MSP-CNN model can improve the accuracy of torque monitoring as well as solve the problems of non-convergence and slow convergence of independent CNN network models.
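The abstract states that two CNN classifiers with different or offset torque granularities are fused to obtain a finer classification granularity, but it does not give the fusion rule. One plausible reading, sketched below purely as an assumption, is that each classifier predicts a torque interval and the fused estimate is the overlap of the two intervals. The helper names `bin_interval` and `fuse`, the bin width of 10, and the offset of -5 are all hypothetical.

```python
def bin_interval(index, width, offset=0.0):
    """Torque interval [low, high) covered by class `index` of one classifier."""
    low = offset + index * width
    return low, low + width

def fuse(interval_a, interval_b):
    """Overlap of two predicted intervals, or None if the predictions disagree."""
    low = max(interval_a[0], interval_b[0])
    high = min(interval_a[1], interval_b[1])
    if low >= high:
        return None  # no overlap: the two classifiers are inconsistent
    return low, high

# Classifier A uses bins of width 10 starting at 0; classifier B uses the
# same width shifted by half a bin (hypothetical values).
coarse = bin_interval(2, width=10.0)                 # (20.0, 30.0)
shifted = bin_interval(2, width=10.0, offset=-5.0)   # (15.0, 25.0)
fused = fuse(coarse, shifted)                        # (20.0, 25.0)
```

With a half-bin offset, every consistent pair of predictions narrows the estimate to half the original bin width, which illustrates how two coarse classifiers could yield a finer granularity without training a single classifier on many narrow classes.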
Affiliation(s)
- Chengjun Chen
  - School of Mechanical and Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
  - Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
  - Correspondence: Tel.: +86-532-6805-2755
- Kai Huang
  - School of Mechanical and Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
  - Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
- Dongnian Li
  - School of Mechanical and Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
  - Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
- Zhengxu Zhao
  - School of Mechanical and Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
  - Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
- Jun Hong
  - School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an 710049, China
8
Different Spectral Domain Transformation for Land Cover Classification Using Convolutional Neural Networks with Multi-Temporal Satellite Imagery. REMOTE SENSING 2020. [DOI: 10.3390/rs12071097] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/16/2022]
Abstract
This study compares some different types of spectral domain transformations for convolutional neural network (CNN)-based land cover classification. A novel approach was proposed, which transforms one-dimensional (1-D) spectral vectors into two-dimensional (2-D) features: Polygon graph images (CNN-Polygon) and 2-D matrices (CNN-Matrix). The motivations of this study are that (1) the shape of the converted 2-D images is more intuitive for human eyes to interpret when compared to 1-D spectral input; and (2) CNNs are highly specialized and may be able to similarly utilize this information for land cover classification. Four seasonal Landsat 8 images over three study areas—Lake Tapps, Washington, Concord, New Hampshire, USA, and Gwangju, Korea—were used to evaluate the proposed approach for nine land cover classes compared to several other methods: Random forest (RF), support vector machine (SVM), 1-D CNN, and patch-based CNN. Oversampling and undersampling approaches were conducted to examine the effect of the sample size on the model performance. The CNN-Polygon had better performance than the other methods, with overall accuracies of about 93%–95% for both Concord and Lake Tapps and 80%–84% for Gwangju. The CNN-Polygon particularly performed well when the training sample size was small, less than 200 per class, while the CNN-Matrix resulted in similar or higher performance as sample sizes became larger. The contributing input variables to the models were carefully analyzed through sensitivity analysis based on occlusion maps and accuracy decreases. Our result showed that a more visually intuitive representation of input features for CNN-based classification models yielded higher performance, especially when the training sample size was small. This implies that the proposed graph-based CNNs would be useful for land cover classification where reference data are limited.