1
Qu G, Wu Y, Lv Z, Zhao D, Lu Y, Zhou K, Tang J, Zhang Q, Zhang A. Road-MobileSeg: Lightweight and Accurate Road Extraction Model from Remote Sensing Images for Mobile Devices. SENSORS (BASEL, SWITZERLAND) 2024; 24:531. [PMID: 38257624 PMCID: PMC10819684 DOI: 10.3390/s24020531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/13/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 01/24/2024]
Abstract
Current deep learning-based road extraction models for remote sensing images are computationally demanding and memory-intensive because of their high model complexity, making them impractical for mobile devices. This study aimed to develop a lightweight and accurate road extraction model, called Road-MobileSeg, to automatically extract roads from remote sensing images on mobile devices. The Road-MobileFormer was designed as the backbone of Road-MobileSeg. In the Road-MobileFormer, a Coordinate Attention Module was incorporated to encode both channel relationships and long-range dependencies with precise positional information, enhancing the accuracy of road extraction. Additionally, a Micro Token Pyramid Module was introduced to decrease the number of parameters and computations required by the model, rendering it more lightweight. Moreover, three model variants, namely Road-MobileSeg-Tiny, Road-MobileSeg-Small, and Road-MobileSeg-Base, which share a common foundational structure but differ in the number of parameters and computations, were developed. These models vary in complexity and suit mobile devices with different memory capacities and computing power. The experimental results demonstrate that the proposed models outperform typical comparison models in accuracy, model size, and latency, achieving high accuracy and low latency on mobile devices. This indicates that models integrating the Coordinate Attention Module and the Micro Token Pyramid Module overcome the limitations of current research and are suitable for road extraction from remote sensing images on mobile devices.
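Editor's note: the Coordinate Attention idea mentioned in this abstract pools a feature map along each spatial axis separately, so the resulting attention weights keep positional information along the other axis. The following is a dependency-free NumPy sketch of that direction-aware pooling, not the authors' implementation; the published module also applies shared 1x1 convolutions and normalization, which are omitted here for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x):
    """Sketch of coordinate attention on a (C, H, W) feature map.

    Pools along each spatial axis separately so the attention weights
    retain position information along the other axis, then reweights
    the input per row and per column.
    """
    # Direction-aware pooling: average over width -> (C, H), over height -> (C, W)
    pool_h = x.mean(axis=2)  # captures long-range dependencies along height
    pool_w = x.mean(axis=1)  # captures long-range dependencies along width
    # The real module transforms the pooled maps with shared 1x1 convolutions;
    # here we gate the input with the pooled maps directly.
    attn_h = sigmoid(pool_h)[:, :, None]   # (C, H, 1)
    attn_w = sigmoid(pool_w)[:, None, :]   # (C, 1, W)
    return x * attn_h * attn_w

# Example: the module reweights features without changing their shape
feat = np.random.rand(4, 8, 8).astype(np.float32)
out = coordinate_attention(feat)
```

Because each attention weight lies in (0, 1), the output is a position-wise attenuated copy of the input with the same shape.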
Affiliation(s)
- Guangjun Qu
- School of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing 100020, China; (G.Q.); (Y.L.)
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100045, China; (Y.W.); (K.Z.)
- Yue Wu
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100045, China; (Y.W.); (K.Z.)
- Zhihong Lv
- College of Ocean Technology and Surveying, Jiangsu Ocean University, Lianyungang 222000, China;
- Dequan Zhao
- School of Information Science and Engineering, Shandong Agricultural University, Tai’an 271000, China;
- Yingpeng Lu
- School of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing 100020, China; (G.Q.); (Y.L.)
- Kefa Zhou
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100045, China; (Y.W.); (K.Z.)
- Jiakui Tang
- College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Yanshan Earth Key Zone and Surface Flux Observation and Research Station, University of Chinese Academy of Sciences, Beijing 101408, China
- Qing Zhang
- Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100045, China; (Y.W.); (K.Z.)
- Institute of Aerospace Information Innovation, Chinese Academy of Sciences, Beijing 100045, China
- Aijun Zhang
- School of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing 100020, China; (G.Q.); (Y.L.)
2
Abstract
After the revival of deep learning in computer vision in 2012, SAR ship detection also entered the deep learning era. Deep learning-based computer vision algorithms work in an end-to-end pipeline, without the need to design features manually, and they achieve remarkable performance. As a result, they are also used to detect ships in SAR images. This direction began with the paper we published at BIGSARDATA 2017, in which the first dataset, SSDD, was used and shared with peers. Since then, many researchers have focused their attention on this field. In this paper, we analyze the past, present, and future of deep learning-based ship detection algorithms in SAR images. In the past section, we analyze the difference between traditional CFAR (constant false alarm rate)-based and deep learning-based detectors through theory and experiment: the traditional method is unsupervised, while deep learning is strongly supervised, and their performance differs severalfold. In the present section, we analyze the 177 published papers on SAR ship detection, highlighting the dataset, algorithm, performance, deep learning framework, country, timeline, etc. We then introduce in detail the single-stage, two-stage, anchor-free, trained-from-scratch, oriented-bounding-box, multi-scale, and real-time detectors used in the 177 papers, and analyze the speed-accuracy trade-offs of each. In the future section, we list the open problems and directions of this field. Over the past five years, AP50 on SSDD has risen from 78.8% in 2017 to 97.8% in 2022. We also argue that researchers should design algorithms according to the specific characteristics of SAR images. The next step is to bridge the gap between SAR ship detection and computer vision by merging the small datasets into a large one and formulating corresponding standards and benchmarks.
We expect that this survey of 177 papers will help readers better understand these algorithms and stimulate more research in this field.
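Editor's note: the CFAR baseline this abstract contrasts with deep detectors can be illustrated with a minimal 1-D cell-averaging CFAR sketch. This is a generic textbook formulation, not any specific detector evaluated in the survey; the window sizes and false-alarm rate below are illustrative choices.

```python
import numpy as np

def ca_cfar(signal, num_train=8, num_guard=2, rate_fa=1e-2):
    """Minimal 1-D cell-averaging CFAR sketch (illustrative only).

    For each cell under test, the noise level is estimated from the
    surrounding training cells (guard cells excluded), and the threshold
    is scaled so the false-alarm rate stays constant.
    """
    n = len(signal)
    num_cells = 2 * num_train
    # CA-CFAR threshold factor for the requested false-alarm probability
    alpha = num_cells * (rate_fa ** (-1.0 / num_cells) - 1.0)
    detections = np.zeros(n, dtype=bool)
    half = num_train + num_guard
    for i in range(half, n - half):
        # Training cells on both sides of the cell under test
        window = np.r_[signal[i - half:i - num_guard],
                       signal[i + num_guard + 1:i + half + 1]]
        detections[i] = signal[i] > alpha * window.mean()
    return detections

# Example: one strong target embedded in exponential clutter
rng = np.random.default_rng(0)
sig = rng.exponential(1.0, 200)
sig[100] = 50.0  # injected target
hits = ca_cfar(sig)
```

Note the contrast with the learned detectors discussed above: this method is unsupervised and needs no training data, but its threshold adapts only to local intensity statistics, not to shape or context.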
3
A Residual Attention and Local Context-Aware Network for Road Extraction from High-Resolution Remote Sensing Imagery. REMOTE SENSING 2021. [DOI: 10.3390/rs13244958] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Extracting road information from high-resolution remote sensing images (HRI) can provide crucial geographic information for many applications. As the resolution of remote sensing images improves, the image data contain more abundant feature information. However, this also increases the spatial heterogeneity between different types of roads, making it difficult to accurately discern road and non-road regions using spectral characteristics alone. To remedy these issues, a novel residual attention and local context-aware network (RALC-Net) is proposed for extracting a complete and continuous road network from HRI. RALC-Net uses a dual-encoder structure to improve the feature extraction capability of the network, with the two branches taking different feature information as input. Specifically, a residual attention module is constructed that combines residual connections, which integrate spatial context information, with an attention mechanism that highlights local semantics; together they allow the network to retain complete road edge information, emphasize essential semantics, and enhance generalization. In addition, a multi-scale dilated convolution module is used to extract multi-scale spatial receptive fields and further improve performance. Ablation experiments verify the contribution of each component of RALC-Net. By combining low-level features with high-level semantics, we extract road information and compare the results with other state-of-the-art models. The experimental results show that the proposed RALC-Net has excellent feature representation ability and robust generalizability, and can extract complete road information from complex environments.
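Editor's note: the multi-scale dilated convolution mentioned in this abstract enlarges the receptive field by spacing kernel taps apart without adding parameters. The 1-D NumPy sketch below illustrates only that mechanism; the paper's module operates on 2-D feature maps, and the kernel here is an arbitrary example.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """Sketch of 1-D dilated convolution (valid padding).

    The kernel taps are spaced `dilation` samples apart, so a size-k
    kernel covers (k - 1) * dilation + 1 input samples.
    """
    k = len(kernel)
    span = (k - 1) * dilation
    out = np.zeros(len(x) - span)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * x[i + j * dilation] for j in range(k))
    return out

x = np.arange(10, dtype=float)
k = np.array([1.0, 1.0, 1.0])
y1 = dilated_conv1d(x, k, 1)  # receptive field of 3 samples
y2 = dilated_conv1d(x, k, 2)  # receptive field of 5 samples, same 3 weights
```

Stacking several such convolutions with different dilation rates, as multi-scale modules typically do, aggregates context at several spatial extents at once.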
4
Detection of Inflatable Boats and People in Thermal Infrared with Deep Learning Methods. SENSORS 2021; 21:s21165330. [PMID: 34450770 PMCID: PMC8401691 DOI: 10.3390/s21165330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/27/2021] [Revised: 07/29/2021] [Accepted: 08/05/2021] [Indexed: 12/02/2022]
Abstract
Smuggling of drugs and cigarettes in small inflatable boats across border rivers is a serious threat to the EU’s financial interests. Early detection of such threats is challenging due to difficult and changing environmental conditions. This study reports on the automatic detection of small inflatable boats and people in rough, wild terrain in the thermal infrared domain. Three acquisition campaigns were carried out during spring, summer, and fall under various weather conditions. Three deep learning algorithms (YOLOv2, YOLOv3, and Faster R-CNN) working with six different feature extraction neural networks were trained and evaluated in terms of performance and processing time. The best performance was achieved by Faster R-CNN with ResNet101; however, its processing takes a long time and requires a powerful graphics processing unit.