1. O'Malley D, Delorey AA, Guiltinan EJ, Ma Z, Kadeethum T, Lackey G, Lee J, Santos JE, Follansbee E, Nair MC, Pekney NJ, Jahan I, Mehana M, Hora P, Carey JW, Govert A, Varadharajan C, Ciulla F, Biraud SC, Jordan P, Dubey M, Santos A, Wu Y, Kneafsey TJ, Dubey MK, Weiss CJ, Downs C, Boutot J, Kang M, Viswanathan H. Unlocking Solutions: Innovative Approaches to Identifying and Mitigating the Environmental Impacts of Undocumented Orphan Wells in the United States. Environmental Science & Technology 2024; 58:19584-19594. PMID: 39344066; PMCID: PMC11542881; DOI: 10.1021/acs.est.4c02069.
Abstract
In the United States, hundreds of thousands of undocumented orphan wells have been abandoned, leaving the burden of managing environmental hazards to governmental agencies or the public. These wells, the result of over a century of fossil fuel extraction without adequate regulation, emit greenhouse gases and leak toxic substances into groundwater, and for most of them basic information such as location and depth is unknown or unverified. Addressing this issue necessitates innovative and interdisciplinary approaches for locating, characterizing, and mitigating their environmental impacts. Our survey of the United States revealed the need for tools to identify well locations and assess well conditions, prompting the development of technologies including machine learning to automatically extract information from old records (95%+ accuracy), remote sensing technologies such as aero-magnetometers to find buried wells, and cost-effective methods for estimating methane emissions. Notably, fixed-wing drones equipped with magnetometers have emerged as cost-effective and efficient tools for discovering unknown wells, offering advantages over helicopters and quadcopters. Efforts also involved leveraging local knowledge through outreach to state and tribal governments as well as citizen science initiatives. These initiatives aim to contribute significantly to environmental sustainability by reducing greenhouse gas emissions and improving air and water quality.
Affiliation(s)
- Daniel O'Malley: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Andrew A. Delorey: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Eric J. Guiltinan: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Zhiwei Ma: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Greg Lackey: National Energy Technology Laboratory, Pittsburgh, Pennsylvania 15236, United States
- James Lee: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Javier E. Santos: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Emily Follansbee: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Manoj C. Nair: National Oceanic and Atmospheric Administration, Washington D.C. 20230, United States; Cooperative Institute for Research in Environmental Sciences, University of Colorado at Boulder, Boulder, Colorado 80309, United States
- Natalie J. Pekney: National Energy Technology Laboratory, Pittsburgh, Pennsylvania 15236, United States
- Ismot Jahan: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Mohamed Mehana: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Priya Hora: Sandia National Laboratory, Albuquerque, New Mexico 87123, United States
- J. William Carey: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Andrew Govert: Department of Energy, Washington D.C. 20585, United States
- Fabio Ciulla: Lawrence Berkeley Laboratory, Berkeley, California 94720, United States
- Preston Jordan: Lawrence Berkeley Laboratory, Berkeley, California 94720, United States
- Mohit Dubey: Lawrence Berkeley Laboratory, Berkeley, California 94720, United States
- Andre Santos: Lawrence Berkeley Laboratory, Berkeley, California 94720, United States
- Yuxin Wu: Lawrence Berkeley Laboratory, Berkeley, California 94720, United States
- Manvendra K. Dubey: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Chester J. Weiss: Sandia National Laboratory, Albuquerque, New Mexico 87123, United States
- Christine Downs: Sandia National Laboratory, Albuquerque, New Mexico 87123, United States
- Jade Boutot: McGill University, Montreal, Quebec H3A 0G4, Canada
- Mary Kang: McGill University, Montreal, Quebec H3A 0G4, Canada
- Hari Viswanathan: Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
2. Wei D, Zhang W, Li H, Jiang Y, Xian Y, Deng J. RTINet: A Lightweight and High-Performance Railway Turnout Identification Network Based on Semantic Segmentation. Entropy (Basel, Switzerland) 2024; 26:878. PMID: 39451954; PMCID: PMC11507317; DOI: 10.3390/e26100878.
Abstract
To lighten the workload of train drivers and enhance railway transportation safety, a novel and intelligent method for railway turnout identification is investigated based on semantic segmentation. More specifically, a railway turnout scene perception (RTSP) dataset is constructed and annotated manually in this paper, wherein the innovative concept of side rails is introduced as part of the labeling process. After that, based on the work of Deeplabv3+, combined with a lightweight design and an attention mechanism, a railway turnout identification network (RTINet) is proposed. Firstly, in consideration of the need for rapid response in the deployment of the identification model on high-speed trains, this paper selects the MobileNetV2 network, renowned for its suitability for lightweight deployment, as the backbone of the RTINet model. Secondly, to reduce the computational load of the model while ensuring accuracy, depthwise separable convolutions are employed to replace the standard convolutions within the network architecture. Thirdly, the bottleneck attention module (BAM) is integrated into the model to enhance position and feature information perception, bolster the robustness and quality of the segmentation masks generated, and ensure that the outcomes are characterized by precision and reliability. Finally, to address the issue of foreground and background imbalance in turnout recognition, the Dice loss function is incorporated into the network training procedure. Both the quantitative and qualitative experimental results demonstrate that the proposed method is feasible for railway turnout identification, and it outperformed the compared baseline models. In particular, the RTINet was able to achieve a remarkable mIoU of 85.94%, coupled with an inference speed of 78 fps on the customized dataset. Furthermore, the effectiveness of each optimized component of the proposed RTINet is verified by an additional ablation study.
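The depthwise separable convolutions referred to above factor a standard convolution into a per-channel spatial convolution followed by a 1x1 pointwise convolution, which is where most of the parameter savings come from. A minimal PyTorch sketch for illustration (channel sizes and the BN/ReLU arrangement are assumptions, not the exact RTINet configuration):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Standard conv replaced by a depthwise conv (groups=in_ch) plus a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride=stride,
                                   padding=kernel_size // 2, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Parameter comparison against a standard 3x3 convolution (256 -> 256 channels)
std = nn.Conv2d(256, 256, 3, padding=1, bias=False)
dws = DepthwiseSeparableConv(256, 256)
print(sum(p.numel() for p in std.parameters()))  # 589824
print(sum(p.numel() for p in dws.parameters()))  # ~68k (2304 depthwise + 65536 pointwise + BN)
```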
Affiliation(s)
- Dehua Wei: School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China; Key Laboratory of Railway Industry on Plateau Railway Transportation Intelligent Management and Control, Lanzhou Jiaotong University, Lanzhou 730070, China
- Wenjun Zhang: School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611730, China
- Haijun Li: School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China; Key Laboratory of Railway Industry on Plateau Railway Transportation Intelligent Management and Control, Lanzhou Jiaotong University, Lanzhou 730070, China
- Yuxing Jiang: School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China; Key Laboratory of Railway Industry on Plateau Railway Transportation Intelligent Management and Control, Lanzhou Jiaotong University, Lanzhou 730070, China
- Yong Xian: School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China; Key Laboratory of Railway Industry on Plateau Railway Transportation Intelligent Management and Control, Lanzhou Jiaotong University, Lanzhou 730070, China
- Jiangli Deng: School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China
3. Boyd C, Brown GC, Kleinig TJ, Mayer W, Dawson J, Jenkinson M, Bezak E. Hyperparameter selection for dataset-constrained semantic segmentation: Practical machine learning optimization. J Appl Clin Med Phys 2024:e14542. PMID: 39387832; DOI: 10.1002/acm2.14542.
Abstract
PURPOSE/AIM This paper provides a pedagogical example of systematic machine learning optimization for small-dataset image segmentation, emphasizing hyperparameter selection. A simple process is presented for medical physicists to examine hyperparameter optimization, and it is applied to a case study demonstrating the benefit of the method. MATERIALS AND METHODS An unrestricted public Computed Tomography (CT) dataset with binary organ segmentation was used to develop a multiclass segmentation model. To start the optimization process, a preliminary manual search of hyperparameters was conducted, and from there a grid search identified the most influential result metrics. A total of 658 different models were trained in 2100 h, using 13 160 effective patients. The large set of results was analyzed using random forest regression to identify the relative impact of each hyperparameter. RESULTS Metric-implied segmentation quality (accuracy 96.8%, precision 95.1%) and visual inspection were found to be mismatched. In this work, batch normalization was most important, but performance varied with the hyperparameters and metrics selected. Targeted grid-search optimization combined with random forest analysis of relative hyperparameter importance proved an easily implementable sensitivity-analysis approach. CONCLUSION The proposed optimization method gives a systematic and quantitative approach to something intuitively understood: that hyperparameters change model performance. Even the grid-search optimization with random forest analysis presented here can be informative within hardware and data quality/availability limitations, adding confidence to model validity and minimizing decision-making risks. By providing a guided methodology, this work helps medical physicists improve their model optimization, irrespective of the specific challenges posed by their datasets and model designs.
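The grid-search-plus-random-forest workflow summarized above can be reproduced in a few lines: collect one score per trained hyperparameter combination, regress the score on the hyperparameter settings, and read off the feature importances as relative hyperparameter impact. A hedged sketch with an illustrative results table (the hyperparameter names, values, and scores are placeholders, not the study's actual grid):

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# One row per trained model: hyperparameter settings plus the resulting metric.
# In the study this table came from hundreds of grid-search runs; here it is illustrative.
results = pd.DataFrame({
    "learning_rate": [1e-3, 1e-3, 1e-4, 1e-4, 1e-3, 1e-4],
    "batch_size":    [4, 8, 4, 8, 16, 16],
    "batch_norm":    [1, 0, 1, 0, 1, 0],          # encoded as 0/1
    "dice_score":    [0.81, 0.74, 0.78, 0.70, 0.83, 0.69],
})

X = results.drop(columns="dice_score")
y = results["dice_score"]

rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)
importance = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importance)  # relative hyperparameter impact on the chosen metric
```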
Affiliation(s)
- Chris Boyd: Allied Health and Human Performance, University of South Australia, Adelaide, Australia; Medical Physics and Radiation Safety, South Australia Medical Imaging, Adelaide, Australia
- Gregory C Brown: Allied Health and Human Performance, University of South Australia, Adelaide, Australia
- Timothy J Kleinig: Department of Neurology, Royal Adelaide Hospital, Adelaide, Australia; Adelaide Medical School, The University of Adelaide, Adelaide, Australia
- Wolfgang Mayer: Discipline of Surgery, University of Adelaide, Adelaide, Australia
- Joseph Dawson: Department of Vascular and Endovascular Surgery, Royal Adelaide Hospital, Adelaide, Australia; Industrial AI Research Centre, UniSA STEM, University of South Australia, Adelaide, Australia
- Mark Jenkinson: Australian Institute for Machine Learning (AIML), School of Computer and Mathematical Sciences, University of Adelaide, Adelaide, Australia; South Australian Health and Medical Research Institute (SAHMRI), Adelaide, Australia; Wellcome Trust Centre for Integrative Neuroimaging (WIN), Nuffield Department of Clinical Neurosciences (FMRIB), University of Oxford, Oxford, UK
- Eva Bezak: Allied Health and Human Performance, University of South Australia, Adelaide, Australia; Department of Physics, University of Adelaide, Adelaide, Australia
4. Sigurðardóttir AR, Sveinsdóttir HI, Schultz N, Einarsson H, Gudjónsdóttir M. Sequence Segmentation of Nematodes in Atlantic Cod with Multispectral Imaging Data. Foods 2024; 13:2952. PMID: 39335880; PMCID: PMC11430828; DOI: 10.3390/foods13182952.
Abstract
Nematodes pose significant challenges for the fish processing industry, particularly in white fish. Despite technological advances, the industry still depends on manual labor for the detection and extraction of nematodes. This study addresses the initial steps of automatic nematode detection and differentiation from other common defects in fish fillets, such as skin remnants and blood spots. VideometerLab 4, an advanced Multispectral Imaging (MSI) system, was used to acquire 270 images of 50 Atlantic cod fillets under controlled conditions. In total, 173 nematodes were labeled using the Segment Anything Model (SAM), which is trained to automatically segment objects of interest from only a few representative pixels. With the acquired dataset, we study the potential of identifying nematodes through their spectral signature. We incorporated normalized Canonical Discriminant Analysis (nCDA) to develop segmentation models trained to distinguish between different components within the fish fillets. By combining multiple segmentation models, we aimed to achieve a satisfactory balance between false negatives and false positives. This resulted in 88% precision and 79% recall on our annotated test data. This approach could improve process control by accurately identifying fillets with nematodes. Using MSI minimizes unnecessary inspection of fillets in good condition and concurrently boosts product safety and quality.
Affiliation(s)
- Andrea Rakel Sigurðardóttir: Faculty of Food Science and Nutrition, University of Iceland, Sæmundargata 12, 102 Reykjavík, Iceland
- Hildur Inga Sveinsdóttir: Faculty of Food Science and Nutrition, University of Iceland, Sæmundargata 12, 102 Reykjavík, Iceland; Matís, Food and Biotech R&D, Vínlandsleið 12, 113 Reykjavík, Iceland
- Hafsteinn Einarsson: Faculty of Computer Science, University of Iceland, Bjargargata 1, 102 Reykjavík, Iceland
- María Gudjónsdóttir: Faculty of Food Science and Nutrition, University of Iceland, Sæmundargata 12, 102 Reykjavík, Iceland; Matís, Food and Biotech R&D, Vínlandsleið 12, 113 Reykjavík, Iceland
5. Wang W, Chen Q, Shen Y, Xiang Z. Leakage Identification of Underground Structures Using Classification Deep Neural Networks and Transfer Learning. Sensors (Basel, Switzerland) 2024; 24:5569. PMID: 39275478; PMCID: PMC11397748; DOI: 10.3390/s24175569.
Abstract
Water leakage defects often occur in underground structures, leading to accelerated structural aging and threatening structural safety. Leakage identification can detect early deterioration of underground structures and provide important guidance for reinforcement and maintenance. Deep learning-based computer vision methods have developed rapidly and are widely used in many fields. However, establishing a deep learning model for underground structure leakage identification usually requires a large amount of training data on leakage defects, which is expensive to collect. To overcome the data shortage, a deep neural network method for leakage identification is developed based on transfer learning in this paper. For comparison, four well-known classification models, including VGG16, AlexNet, SqueezeNet, and ResNet18, are constructed. To train the classification models, a transfer learning strategy is developed, and a dataset of underground structure leakage is created. Finally, the classification performance of the different deep learning models on the leakage dataset is comparatively studied under different sizes of training data. The results showed that the VGG16, AlexNet, and SqueezeNet models with transfer learning provide overall higher and more stable classification performance on the leakage dataset than those without transfer learning. The ResNet18 model with transfer learning provides overall classification performance similar to that without transfer learning, but its classification performance is more stable. In addition, the SqueezeNet model obtains overall higher and more stable performance than the comparative models on the leakage dataset for all classification metrics.
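Transfer learning in this setting typically means starting from an ImageNet-pretrained backbone, replacing the final classifier with a two-class leakage/no-leakage head, and fine-tuning on the small leakage dataset. A minimal torchvision sketch (ResNet18 and the frozen-layer choice are illustrative, and the torchvision >= 0.13 weights API is assumed; this is not the authors' exact training setup):

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone and adapt it to 2 classes (leakage / no leakage).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)

# Optionally freeze early layers so only the last block and the new head are fine-tuned.
for name, param in model.named_parameters():
    if not (name.startswith("layer4") or name.startswith("fc")):
        param.requires_grad = False

optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One optimization step on a batch (images: Nx3x224x224, labels: N class indices)."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```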
Affiliation(s)
- Wenyang Wang: Shandong Zhiyuan Electric Power Design Consulting Co., Ltd., Jinan 250021, China; Economic & Technology Research Institute of State Grid Shandong Electric Power Company, Jinan 250021, China
- Qingwei Chen: Shandong Zhiyuan Electric Power Design Consulting Co., Ltd., Jinan 250021, China; Economic & Technology Research Institute of State Grid Shandong Electric Power Company, Jinan 250021, China
- Yongjiang Shen: Hunan Province Key Laboratory for Disaster Prevention and Mitigation of Rail Transit Engineering Structure, Central South University, Changsha 410075, China; School of Civil Engineering, Central South University, Changsha 410075, China
- Zhengliang Xiang: Hunan Province Key Laboratory for Disaster Prevention and Mitigation of Rail Transit Engineering Structure, Central South University, Changsha 410075, China; School of Civil Engineering, Central South University, Changsha 410075, China
6. Chen C, Chen Y, Li X, Ning H, Xiao R. Linear semantic transformation for semi-supervised medical image segmentation. Comput Biol Med 2024; 173:108331. PMID: 38522252; DOI: 10.1016/j.compbiomed.2024.108331.
Abstract
Medical image segmentation is a central research topic and a foundation for developing intelligent medical systems. Recently, deep learning for medical image segmentation has become standard practice and has achieved significant success, advancing reconstruction, disease diagnosis, and surgical planning. However, semantic learning is often inefficient owing to the lack of supervision of feature maps, with the result that high-quality segmentation models rely on numerous, accurate data annotations. Learning robust semantic representations in latent spaces remains a challenge. In this paper, we propose a novel semi-supervised learning framework to learn vital attributes in medical images, which constructs generalized representations from diverse semantics to realize medical image segmentation. We first build a self-supervised learning component that achieves context recovery by reconstructing the space and intensity of medical images, which provides semantic representations for feature maps. Subsequently, we combine semantic-rich feature maps and apply a simple linear semantic transformation to convert them into image segmentations. The proposed framework was tested using five medical segmentation datasets. Quantitative assessments indicate the highest scores for our method on the IXI (73.78%), ScaF (47.50%), COVID-19-Seg (50.72%), PC-Seg (65.06%), and Brain-MR (72.63%) datasets. Finally, we compared our method with the latest semi-supervised learning methods and obtained 77.15% and 75.22% DSC values, respectively, ranking first on two representative datasets. The experimental results not only proved that the proposed linear semantic transformation is effectively applicable to medical image segmentation, but also demonstrated its simplicity and ease of use in pursuing robust segmentation in semi-supervised learning. Our code is available at: https://github.com/QingYunA/Linear-Semantic-Transformation-for-Semi-Supervised-Medical-Image-Segmentation.
Affiliation(s)
- Cheng Chen: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Yunqing Chen: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Xiaoheng Li: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Huansheng Ning: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Ruoxiu Xiao: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China; Shunde Innovation School, University of Science and Technology Beijing, Foshan, 100024, China
7. Gao C, Shi Y, Yang S, Lei B. SAA-SDM: Neural Networks Faster Learned to Segment Organ Images. Journal of Imaging Informatics in Medicine 2024; 37:547-562. PMID: 38343217; PMCID: PMC11031521; DOI: 10.1007/s10278-023-00947-1.
Abstract
In the field of medicine, rapidly and accurately segmenting organs in medical images is a crucial application of computer technology. This paper introduces a feature map module, Strength Attention Area Signed Distance Map (SAA-SDM), based on the principal component analysis (PCA) principle. The module is designed to accelerate neural networks' convergence speed in rapidly achieving high precision. SAA-SDM provides the neural network with confidence information regarding the target and background, similar to the signed distance map (SDM), thereby enhancing the network's understanding of semantic information related to the target. Furthermore, this paper presents a training scheme tailored for the module, aiming to achieve finer segmentation and improved generalization performance. Validation of our approach is carried out using TRUS and chest X-ray datasets. Experimental results demonstrate that our method significantly enhances neural networks' convergence speed and precision. For instance, the convergence speed of UNet and UNet++ is improved by more than 30%. Moreover, Segformer achieves an increase of over 6% and 3% in mIoU (mean Intersection over Union) on two test datasets without requiring pre-trained parameters. Our approach reduces the time and resource costs associated with training neural networks for organ segmentation tasks while effectively guiding the network to achieve meaningful learning even without pre-trained parameters.
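For context, a conventional signed distance map of the kind SAA-SDM is compared against can be computed directly from a binary mask, with positive values inside the target and negative values outside (sign conventions vary). A small SciPy sketch, not the authors' SAA-SDM construction itself:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed distance map of a binary mask: >0 inside the object, <0 outside."""
    mask = mask.astype(bool)
    if not mask.any() or mask.all():
        return np.zeros(mask.shape, dtype=np.float32)
    inside = distance_transform_edt(mask)    # distance of object pixels to the background
    outside = distance_transform_edt(~mask)  # distance of background pixels to the object
    return (inside - outside).astype(np.float32)

# Example: a 64x64 mask containing a filled square
m = np.zeros((64, 64), dtype=np.uint8)
m[20:40, 20:40] = 1
sdm = signed_distance_map(m)
print(sdm.min(), sdm.max())
```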
Affiliation(s)
- Chao Gao: College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002, China; Hubei Key Laboratory of Intelligent Vision Monitoring for Hydropower Engineering, China Three Gorges University, Yichang, Hubei 443002, China
- Yongtao Shi: College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002, China; Hubei Key Laboratory of Intelligent Vision Monitoring for Hydropower Engineering, China Three Gorges University, Yichang, Hubei 443002, China
- Shuai Yang: College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002, China; Hubei Key Laboratory of Intelligent Vision Monitoring for Hydropower Engineering, China Three Gorges University, Yichang, Hubei 443002, China
- Bangjun Lei: College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002, China; Hubei Key Laboratory of Intelligent Vision Monitoring for Hydropower Engineering, China Three Gorges University, Yichang, Hubei 443002, China
8. Ruenjit S, Siricharoen P, Khamwan K. Automated size-specific dose estimates framework in thoracic CT using convolutional neural network based on U-Net model. J Appl Clin Med Phys 2024; 25:e14283. PMID: 38295146; DOI: 10.1002/acm2.14283.
Abstract
PURPOSE This study aimed to develop an automated method that uses a convolutional neural network (CNN) for calculating size-specific dose estimates (SSDEs) based on the corrected effective diameter (Deff,corr) in thoracic computed tomography (CT). METHODS Transaxial images obtained from 108 adult patients who underwent non-contrast thoracic CT scans were analyzed. To calculate the Deff,corr according to Mihailidis et al., the average relative electron densities for lung, bone, and other tissues were used to correct the lateral and anterior-posterior dimensions. The CNN architecture based on the U-Net algorithm was used for automated segmentation of three classes of tissues and the background region to calculate dimensions and Deff,corr values. Then, 108 thoracic CT images and generated segmentation masks were used for network training. The water-equivalent diameter (Dw) was determined according to the American Association of Physicists in Medicine Task Group 220. Linear regression and Bland-Altman analysis were performed to determine the correlations between SSDE(Deff,corr, automated), SSDE(Deff,corr, manual), and SSDE(Dw). RESULTS High agreement was obtained between the manual and automated methods for calculating the Deff,corr SSDE. The mean values for the SSDE(Deff,corr, manual), SSDE(Dw), and SSDE(Deff,corr, automated) were 14.3 ± 2.1 mGy, 14.6 ± 2.2 mGy, and 14.5 ± 2.4 mGy, respectively. The U-Net model was successfully trained and used to accurately predict SSDEs, with results comparable to manual-labeling results. CONCLUSION The proposed automated framework using a CNN offers a reliable and efficient solution for determining the Deff,corr SSDE in thoracic CT.
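For reference, AAPM TG-220 defines the water-equivalent diameter from the mean CT number inside the patient contour, Dw = 2 * sqrt[(mean_HU / 1000 + 1) * A_ROI / pi], and the SSDE is the CTDIvol scaled by a size-dependent conversion factor (AAPM TG-204). A short sketch; the exponential-fit coefficients below only approximate the published 32-cm-phantom factors and are not taken from this paper:

```python
import numpy as np

def water_equivalent_diameter(ct_slice_hu: np.ndarray, body_mask: np.ndarray,
                              pixel_area_cm2: float) -> float:
    """AAPM TG-220 water-equivalent diameter (cm) from one axial CT slice."""
    mean_hu = ct_slice_hu[body_mask > 0].mean()
    area_cm2 = body_mask.sum() * pixel_area_cm2
    area_w = (mean_hu / 1000.0 + 1.0) * area_cm2       # water-equivalent area
    return 2.0 * np.sqrt(area_w / np.pi)

def ssde(ctdi_vol_mgy: float, d_w_cm: float,
         a: float = 3.70, b: float = 0.0367) -> float:
    """SSDE = f(Dw) * CTDIvol; a and b approximate the TG-204 32-cm-phantom fit (assumed values)."""
    f = a * np.exp(-b * d_w_cm)
    return f * ctdi_vol_mgy
```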
Affiliation(s)
- Sakultala Ruenjit: Medical Physics Program, Department of Radiology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand; Division of Diagnostic Radiology, Department of Radiology, King Chulalongkorn Memorial Hospital, The Thai Red Cross Society, Bangkok, Thailand; Chulalongkorn University Biomedical Imaging Group, Department of Radiology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Punnarai Siricharoen: The Perceptual Intelligent Computing Lab, Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
- Kitiwat Khamwan: Medical Physics Program, Department of Radiology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand; Chulalongkorn University Biomedical Imaging Group, Department of Radiology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand; Division of Nuclear Medicine, Department of Radiology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
9. Xu S, Yang X, Zhang S, Zheng X, Zheng F, Liu Y, Zhang H, Li L, Ye Q. Evaluation of the corneal topography based on deep learning. Front Med (Lausanne) 2024; 10:1264659. PMID: 38239613; PMCID: PMC10794654; DOI: 10.3389/fmed.2023.1264659.
Abstract
Purpose The current study designed a unique type of corneal topography evaluation method based on deep learning and traditional image processing algorithms. The type of corneal topography of patients was evaluated through the segmentation of important medical zones and the calculation of relevant medical indicators of orthokeratology (OK) lenses. Methods The clinical data of 1,302 myopic subjects were collected retrospectively. A series of U-Net-based neural networks was used to segment the pupil and the treatment zone in the corneal topography, and the decentration, effective defocusing contact range, and other indicators were calculated using image processing algorithms. The type of corneal topography was evaluated according to the evaluation criteria given by the optometrist. Finally, the method described in this article was used to evaluate the type of corneal topography and compare it with the type classified by the optometrist. Results When the important medical zones in the corneal topography were segmented, the precision and recall of the treatment zone reached 0.9587 and 0.9459, respectively, and the precision and recall of the pupil reached 0.9771 and 0.9712. Finally, the method described in this article was used to evaluate the type of corneal topography. When the findings based on deep learning and image processing algorithms were compared to the type of corneal topography marked by the professional optometrist, they demonstrated an accuracy of more than 98%. Conclusion The current study provided an effective and accurate deep learning algorithm to evaluate the type of corneal topography. The deep learning algorithm played an auxiliary role in OK lens fitting, which could help optometrists select the parameters of OK lenses effectively.
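One of the indicators mentioned above, decentration, reduces to the offset between the pupil center and the treatment-zone center in the segmented map. A minimal centroid-based sketch from two binary masks (the clinical definition may differ in detail, e.g., in how pixels are converted to millimetres):

```python
import numpy as np

def centroid(mask: np.ndarray) -> np.ndarray:
    """Centroid (row, col) of a binary mask."""
    ys, xs = np.nonzero(mask)
    return np.array([ys.mean(), xs.mean()])

def decentration_mm(pupil_mask: np.ndarray, treatment_mask: np.ndarray,
                    mm_per_pixel: float) -> float:
    """Distance between the pupil center and the treatment-zone center, in millimetres."""
    offset = centroid(treatment_mask) - centroid(pupil_mask)
    return float(np.linalg.norm(offset) * mm_per_pixel)
```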
Affiliation(s)
- Shuai Xu: Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics and TEDA Applied Physics, Nankai University, Tianjin, China
- Xiaoyan Yang: Tianjin Eye Hospital, Tianjin, China; Tianjin Key Lab of Ophthalmology and Visual Science, Tianjin, China; Nankai University Affiliated Eye Hospital, Tianjin, China; Eye Hospital Optometric Center, Tianjin, China
- Shuxian Zhang: Tianjin Eye Hospital, Tianjin, China; Tianjin Key Lab of Ophthalmology and Visual Science, Tianjin, China; Nankai University Affiliated Eye Hospital, Tianjin, China; Eye Hospital Optometric Center, Tianjin, China
- Xuan Zheng: Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics and TEDA Applied Physics, Nankai University, Tianjin, China
- Fang Zheng: Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics and TEDA Applied Physics, Nankai University, Tianjin, China
- Yin Liu: School of Medicine, Nankai University, Tianjin, China
- Hanyu Zhang: School of Medicine, Nankai University, Tianjin, China
- Lihua Li: Tianjin Eye Hospital, Tianjin, China; Tianjin Key Lab of Ophthalmology and Visual Science, Tianjin, China; Nankai University Affiliated Eye Hospital, Tianjin, China; Eye Hospital Optometric Center, Tianjin, China
- Qing Ye: Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics and TEDA Applied Physics, Nankai University, Tianjin, China
10. Nan G, Li H, Du H, Liu Z, Wang M, Xu S. A Semantic Segmentation Method Based on AS-Unet++ for Power Remote Sensing of Images. Sensors (Basel, Switzerland) 2024; 24:269. PMID: 38203131; PMCID: PMC10781366; DOI: 10.3390/s24010269.
Abstract
In order to achieve the automatic planning of power transmission lines, a key step is to precisely recognize the feature information of remote sensing images. Considering that the feature information has different depths and the feature distribution is not uniform, a semantic segmentation method based on a new AS-Unet++ is proposed in this paper. First, the atrous spatial pyramid pooling (ASPP) and squeeze-and-excitation (SE) modules are added to the traditional Unet, such that the receptive field can be expanded and the important features can be enhanced; this variant is called AS-Unet. Second, an AS-Unet++ structure is built by using different layers of AS-Unet, such that the feature extraction parts of each layer of AS-Unet are stacked together. Compared with Unet, the proposed AS-Unet++ automatically learns features at different depths and determines a depth with optimal performance. Once the optimal number of network layers is determined, the excess layers can be pruned, which greatly reduces the number of trained parameters. The experimental results show that the overall recognition accuracy of AS-Unet++ is significantly improved compared to Unet.
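The squeeze-and-excitation (SE) module added to AS-Unet is a small channel-attention unit: a global-average-pool "squeeze", a two-layer bottleneck MLP, and a sigmoid gate that rescales each channel. A generic PyTorch sketch (the reduction ratio is an assumption, not necessarily the paper's setting):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention (Hu et al.)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: global context per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),                            # excitation: per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # rescale feature maps channel-wise

x = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```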
Affiliation(s)
- Haibo Du: School of Electrical Engineering and Automation, Hefei University of Technology, Hefei 230009, China
11. Jiang SB, Sun YW, Xu S, Zhang HX, Wu ZF. Semi-supervised segmentation of metal-artifact contaminated industrial CT images using improved CycleGAN. Journal of X-Ray Science and Technology 2024; 32:271-283. PMID: 38217629; DOI: 10.3233/xst-230233.
Abstract
Accurate segmentation of industrial CT images is of great significance in industrial fields such as quality inspection and defect analysis. However, reconstruction of industrial CT images often suffers from typical metal artifacts caused by factors like beam hardening, scattering, statistical noise, and partial volume effects. Traditional segmentation methods struggle to achieve precise segmentation of CT images mainly due to the presence of these metal artifacts. Furthermore, acquiring the paired CT image data required by fully supervised networks proves to be extremely challenging. To address these issues, this paper introduces an improved CycleGAN approach for achieving semi-supervised segmentation of industrial CT images. This method not only eliminates the need for removing metal artifacts and noise, but also enables the direct conversion of metal artifact-contaminated images into segmented images without the requirement of paired data. The average values of the quantitative assessment of image segmentation performance reach 0.96645 for the Dice Similarity Coefficient (Dice) and 0.93718 for the Intersection over Union (IoU). In comparison to traditional segmentation methods, the approach presents significant improvements in both quantitative metrics and visual quality, providing valuable insights for further research.
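The two reported metrics, Dice and IoU, are simple overlap ratios between the predicted and reference masks; a sketch for binary segmentations:

```python
import numpy as np

def dice_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """Dice similarity coefficient and intersection over union for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou
```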
Affiliation(s)
- Shi Bo Jiang: Institute of Nuclear and New Energy Technology, Tsinghua University, Beijing, China; Tsinghua University-Beijing Key Laboratory of Nuclear Detection Technology
- Yue Wen Sun: Institute of Nuclear and New Energy Technology, Tsinghua University, Beijing, China; Tsinghua University-Beijing Key Laboratory of Nuclear Detection Technology
- Shuo Xu: Institute of Nuclear and New Energy Technology, Tsinghua University, Beijing, China; Tsinghua University-Beijing Key Laboratory of Nuclear Detection Technology
- Hua Xia Zhang: Institute of Nuclear and New Energy Technology, Tsinghua University, Beijing, China; Tsinghua University-Beijing Key Laboratory of Nuclear Detection Technology
- Zhi Fang Wu: Institute of Nuclear and New Energy Technology, Tsinghua University, Beijing, China; Tsinghua University-Beijing Key Laboratory of Nuclear Detection Technology
12. Wang Y, Yu X, Yang Y, Zhang X, Zhang Y, Zhang L, Feng R, Xue J. A multi-branched semantic segmentation network based on twisted information sharing pattern for medical images. Computer Methods and Programs in Biomedicine 2024; 243:107914. PMID: 37992569; DOI: 10.1016/j.cmpb.2023.107914.
Abstract
BACKGROUND Semantic segmentation plays an indispensable role in clinical diagnosis support, intelligent surgical assistance, personalized treatment planning, and drug development, making it a core area of research in smart healthcare. However, the main challenge in medical image semantic segmentation lies in the accuracy bottleneck, primarily due to the low interactivity of feature information and the lack of deep exploration of local features during feature fusion. METHODS To address this issue, a novel approach called Twisted Information-sharing Pattern for Multi-branched Network (TP-MNet) has been proposed. This architecture facilitates the mutual transfer of features among neighboring branches at the next level, breaking the barrier of semantic isolation and achieving the goal of semantic fusion. Additionally, performing a secondary feature mining during the transfer process effectively enhances the detection accuracy. Building upon the Twisted Pattern transmission in the encoding and decoding stages, enhanced and refined modules for feature fusion have been developed. These modules aim to capture key features of lesions by acquiring contextual semantic information in a broader context. RESULTS The experiments extensively and objectively validated the TP-MNet on 5 medical datasets and compared it with 21 other semantic segmentation models using 7 metrics. Through metric analysis, image comparisons, process examination, and ablation tests, the superiority of TP-MNet was convincingly demonstrated. Additionally, further investigations were conducted to explore the limitations of TP-MNet, thereby clarifying the practical utility of the Twisted Information-sharing Pattern. CONCLUSIONS TP-MNet adopts the Twisted Information-sharing Pattern, leading to a substantial improvement in the semantic fusion effect and directly contributing to enhanced segmentation performance on medical images. Additionally, this semantic broadcasting mode not only underscores the importance of semantic fusion but also highlights a pivotal direction for the advancement of multi-branched architectures.
Affiliation(s)
- Yuefei Wang: College of Computer Science, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan 610106, China
- Xi Yu: Stirling College, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan 610106, China
- Yixi Yang: Institute of Cancer Biology and Drug Discovery, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan 610106, China
- Xiang Zhang: College of Computer Science, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan 610106, China
- Yutong Zhang: College of Computer Science, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan 610106, China
- Li Zhang: College of Computer Science, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan 610106, China
- Ronghui Feng: Stirling College, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan 610106, China
- Jiajing Xue: Stirling College, Chengdu University, 2025 Chengluo Rd., Chengdu, Sichuan 610106, China
13. Nguyen N, Bohak C, Engel D, Mindek P, Strnad O, Wonka P, Li S, Ropinski T, Viola I. Finding Nano-Ötzi: Cryo-Electron Tomography Visualization Guided by Learned Segmentation. IEEE Transactions on Visualization and Computer Graphics 2023; 29:4198-4214. PMID: 35749328; DOI: 10.1109/tvcg.2022.3186146.
Abstract
Cryo-electron tomography (cryo-ET) is a new 3D imaging technique with unprecedented potential for resolving submicron structural details. Existing volume visualization methods, however, are not able to reveal details of interest due to the low signal-to-noise ratio. In order to design more powerful transfer functions, we propose leveraging soft segmentation as an explicit component of visualization for noisy volumes. Our technical realization is based on semi-supervised learning, where we combine the advantages of two segmentation algorithms. First, the weak segmentation algorithm provides good results for propagating sparse user-provided labels to other voxels in the same volume and is used to generate dense pseudo-labels. Second, the powerful deep-learning-based segmentation algorithm learns from these pseudo-labels to generalize the segmentation to other unseen volumes, a task at which the weak segmentation algorithm fails completely. The proposed volume visualization uses deep-learning-based segmentation as a component for segmentation-aware transfer function design. Appropriate ramp parameters can be suggested automatically through frequency distribution analysis. Furthermore, our visualization uses gradient-free ambient occlusion shading to further suppress the visual presence of noise, and to give structural detail the desired prominence. The cryo-ET data studied in our technical experiments are based on the highest-quality tilt series of intact SARS-CoV-2 virions. Our technique shows high impact in the target sciences for visual data analysis of very noisy volumes that cannot be visualized with existing techniques.
14. Cumbajin E, Rodrigues N, Costa P, Miragaia R, Frazão L, Costa N, Fernández-Caballero A, Carneiro J, Buruberri LH, Pereira A. A Systematic Review on Deep Learning with CNNs Applied to Surface Defect Detection. J Imaging 2023; 9:193. PMID: 37888300; PMCID: PMC10607335; DOI: 10.3390/jimaging9100193.
Abstract
Surface defect detection with machine learning has become an important industrial tool and a large field of study for researchers and practitioners in recent years. A simplified source of information is needed to help focus on one type of surface at a time. In this systematic review, we present a classification for surface defect detection based on convolutional neural networks (CNNs) focused on surface types. Findings: Out of 253 records identified, 59 primary studies were eligible. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we analyzed the structures of each study and the concepts related to defects and their types on surfaces. The presented review is mainly focused on finding a classification for the types of surfaces most used in industry (metal, building, ceramic, wood, and special). We delve into the specifics of each surface category, offering illustrative examples of their applications within both industrial and laboratory settings. Furthermore, we propose a new taxonomy of machine learning based on the obtained results and collected information. We summarized the studies and extracted the main characteristics such as type of surface, problem type, timeline, type of network, techniques, and datasets. Among the most relevant results of our analysis, we found that metallic surfaces are the most used, appearing in 62.71% of the studies, and the most prevalent problem type is classification, accounting for 49.15% of the total. Furthermore, we observe that transfer learning was employed in 83.05% of the studies, while data augmentation was utilized in 59.32%. Our findings also provide insights into the cameras most frequently employed, along with the strategies adopted to address illumination challenges present in certain articles and the approach to creating datasets for real-world applications. The main results presented in this review allow for a quick and efficient search of information for researchers and professionals interested in improving the results of their defect detection projects. Finally, we analyzed the trends that could open new fields of study for future research in the area of surface defect detection.
Affiliation(s)
- Esteban Cumbajin: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Nuno Rodrigues: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Paulo Costa: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Rolando Miragaia: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Luís Frazão: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Nuno Costa: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Antonio Fernández-Caballero: Instituto de Investigación en Informática de Albacete, 02071 Albacete, Spain; Departamento de Sistemas Informáticos, Universidad de Castilla-La Mancha, 02071 Albacete, Spain
- Jorge Carneiro: Grestel-Produtos Cerâmicos S.A, Zona Industrial de Vagos-Lote 78, 3840-385 Vagos, Portugal
- Leire H. Buruberri: Grestel-Produtos Cerâmicos S.A, Zona Industrial de Vagos-Lote 78, 3840-385 Vagos, Portugal
- António Pereira: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal; INOV INESC Inovação, Institute of New Technologies, Leiria Office, 2411-901 Leiria, Portugal
15. Li J, Zhou YQ, Zhang QY. Metric networks for enhanced perception of non-local semantic information. Front Neurorobot 2023; 17:1234129. PMID: 37622128; PMCID: PMC10445135; DOI: 10.3389/fnbot.2023.1234129.
Abstract
Introduction Metric learning, as a fundamental research direction in the field of computer vision, has played a crucial role in image matching. Traditional metric learning methods aim at constructing two-branch siamese neural networks to address the challenge of image matching, but they often overlook cross-source and cross-view scenarios. Methods In this article, a multi-branch metric learning model is proposed to address these limitations. The main contributions of this work are as follows: Firstly, we design a multi-branch siamese network model that enhances measurement reliability through information compensation among data points. Secondly, we construct a non-local information perception and fusion model, which accurately distinguishes positive and negative samples by fusing information at different scales. Thirdly, we enhance the model by integrating semantic information and establish an information-consistency mapping between multiple branches, thereby improving robustness in cross-source and cross-view scenarios. Results Experimental tests demonstrating the effectiveness of the proposed method are carried out under various conditions, including homologous, heterogeneous, multi-view, and cross-view scenarios. Compared to the state-of-the-art comparison algorithms, our proposed algorithm achieves improvements of ~1, 2, 1, and 1% in similarity measurement Recall@10, respectively, under these four conditions. Discussion In addition, our work provides an idea for improving the cross-scene applicability of UAV positioning and navigation algorithms.
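At the core of such metric networks is a loss that pulls matching pairs together and pushes non-matching pairs apart in the embedding space. A generic margin-based contrastive loss sketch (the paper's multi-branch fusion and semantic-consistency terms are not reproduced here):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb_a: torch.Tensor, emb_b: torch.Tensor,
                     label: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    """label = 1 for matching pairs, 0 for non-matching pairs."""
    d = F.pairwise_distance(emb_a, emb_b)             # Euclidean distance per pair
    pos = label * d.pow(2)                            # pull matching pairs together
    neg = (1 - label) * F.relu(margin - d).pow(2)     # push non-matching pairs past the margin
    return 0.5 * (pos + neg).mean()
```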
Affiliation(s)
- Yu-qian Zhou: College of Applied Mathematics, Chengdu University of Information Technology, Chengdu, Sichuan, China
16. Feng T, Guo Y, Huang X, Qiao Y. Cattle Target Segmentation Method in Multi-Scenes Using Improved DeepLabV3+ Method. Animals (Basel) 2023; 13:2521. PMID: 37570328; PMCID: PMC10417518; DOI: 10.3390/ani13152521.
Abstract
Obtaining animal regions and the relative positions of animals in a scene is conducive to further studying animal habits, which is of great significance for smart animal farming. However, the complex breeding environment still makes detection difficult. To address the problems of poor target segmentation and the weak generalization ability of existing semantic segmentation models in complex scenes, a semantic segmentation model based on an improved DeepLabV3+ network (Imp-DeepLabV3+) was proposed. Firstly, the backbone network of the DeepLabV3+ model was replaced by MobileNetV2 to enhance the feature extraction capability of the model. Then, a layer-by-layer feature fusion method was adopted in the decoder stage to integrate high-level semantic feature information with low-level high-resolution feature information at multiple scales to achieve a more precise up-sampling operation. Finally, the SENet module was further introduced into the network to enhance information interaction after feature fusion and improve the segmentation precision of the model under complex datasets. The experimental results demonstrate that the Imp-DeepLabV3+ model achieved a high pixel accuracy (PA) of 99.4%, a mean pixel accuracy (MPA) of 98.1%, and a mean intersection over union (MIoU) of 96.8%. Compared to the original DeepLabV3+ model, the segmentation performance of the improved model is significantly better. Moreover, the overall segmentation performance of the Imp-DeepLabV3+ model surpassed that of other commonly used semantic segmentation models, such as Fully Convolutional Networks (FCNs), Lite Reduced Atrous Spatial Pyramid Pooling (LR-ASPP), and U-Net. Therefore, this study can be applied to the field of scene segmentation and is conducive to further analyzing individual animal information and promoting the development of intelligent animal farming.
Affiliation(s)
- Tao Feng: School of Internet, Anhui University, Hefei 230039, China
- Yangyang Guo: School of Internet, Anhui University, Hefei 230039, China; National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Hefei 230039, China
- Xiaoping Huang: School of Internet, Anhui University, Hefei 230039, China; National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Hefei 230039, China
- Yongliang Qiao: Australian Institute for Machine Learning (AIML), The University of Adelaide, Adelaide 5005, Australia
17. Liu G, Wang Q, Zhu J, Hong H. W-Net: Convolutional neural network for segmenting remote sensing images by dual path semantics. PLoS One 2023; 18:e0288311. PMID: 37498885; PMCID: PMC10374094; DOI: 10.1371/journal.pone.0288311.
Abstract
Recent research progress has produced frameworks that allow deep neural networks to extract image features more accurately. In this study, we focus on an attention model that can be useful in deep neural networks and propose a simple but strong feature extraction deep network architecture, W-Net. The architecture of our W-Net network has two mutually independent path structures, and it is designed with the following advantages. (1) There are two independent effective paths in our proposed network structure, and the two paths capture more contextual information from different scales in different ways. (2) The two paths acquire different feature images, and in the upsampling stage we use bilinear interpolation, thus reducing feature map distortion and integrating the differently processed images. (3) Feature image processing occurs at the network bottleneck, where a hierarchical attention module is constructed by reclassifying after the channel attention module and the spatial attention module, resulting in more efficient and accurate processing of feature images. During the experiments, we also tested iSAID, a large-scale high-spatial-resolution remote sensing image dataset, with further experimental comparisons to demonstrate the generality of our method for remote sensing image segmentation.
Affiliation(s)
- Guangjie Liu: College of Computer Science and Technology, Changchun Normal University, Changchun, Jilin, China
- Qi Wang: College of Computer Science and Technology, Changchun Normal University, Changchun, Jilin, China
- Jinlong Zhu: College of Computer Science and Technology, Changchun Normal University, Changchun, Jilin, China
- Haotong Hong: FAW Mold Manufacturing Co., Ltd, Changchun, Jilin, China
18. Song HJ, Park YJ, Jeong HY, Kim BG, Kim JH, Im YG. Detection of Abnormal Changes on the Dorsal Tongue Surface Using Deep Learning. Medicina (Kaunas, Lithuania) 2023; 59:1293. PMID: 37512104; PMCID: PMC10385577; DOI: 10.3390/medicina59071293.
Abstract
Background and Objective: The tongue mucosa often changes due to various local and systemic diseases or conditions. This study aimed to investigate whether deep learning can help detect abnormal regions on the dorsal tongue surface in patients and healthy adults. Materials and Methods: The study collected 175 clinical photographic images of the dorsal tongue surface, which were divided into 7782 cropped images classified into normal, abnormal, and non-tongue regions and trained using the VGG16 deep learning model. The 80 photographic images of the entire dorsal tongue surface were used for the segmentation of abnormal regions using point mapping segmentation. Results: The F1-scores of the abnormal and normal classes were 0.960 (precision: 0.935, recall: 0.986) and 0.968 (precision: 0.987, recall: 0.950), respectively, in the prediction of the VGG16 model. As a result of evaluation using point mapping segmentation, the average F1-scores were 0.727 (precision: 0.717, recall: 0.737) and 0.645 (precision: 0.650, recall: 0.641), the average intersection of union was 0.695 and 0.590, and the average precision was 0.940 and 0.890, respectively, for abnormal and normal classes. Conclusions: The deep learning algorithm used in this study can accurately determine abnormal areas on the dorsal tongue surface, which can assist in diagnosing specific diseases or conditions of the tongue mucosa.
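The point-mapping segmentation described above can be read as sliding the patch classifier over the full tongue image and writing each patch's predicted class back to the region it came from. A simplified sketch (the patch size, stride, and class indices are assumptions, not the study's exact protocol):

```python
import numpy as np
import torch

def point_mapping_segmentation(image: torch.Tensor, model: torch.nn.Module,
                               patch: int = 64, stride: int = 32) -> np.ndarray:
    """Coarse class map obtained by classifying overlapping patches of a (3, H, W) image."""
    model.eval()
    _, h, w = image.shape
    votes = np.zeros((3, h, w))                    # classes: 0 normal, 1 abnormal, 2 non-tongue (assumed)
    with torch.no_grad():
        for y in range(0, h - patch + 1, stride):
            for x in range(0, w - patch + 1, stride):
                logits = model(image[:, y:y + patch, x:x + patch].unsqueeze(0))
                cls = int(logits.argmax(dim=1))
                votes[cls, y:y + patch, x:x + patch] += 1
    return votes.argmax(axis=0)                    # per-pixel majority vote over patch predictions
```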
Collapse
Affiliation(s)
- Ho-Jun Song
- Department of Dental Materials, Dental Science Research Institute, School of Dentistry, Chonnam National University, Gwangju 61186, Republic of Korea
| | - Yeong-Joon Park
- Department of Dental Materials, Dental Science Research Institute, School of Dentistry, Chonnam National University, Gwangju 61186, Republic of Korea
| | - Hie-Yong Jeong
- Department of Artificial Intelligence Convergence, Chonnam National University, Gwangju 61186, Republic of Korea
| | - Byung-Gook Kim
- Department of Oral Medicine, Dental Science Research Institute, School of Dentistry, Chonnam National University, Gwangju 61186, Republic of Korea
| | - Jae-Hyung Kim
- Department of Oral Medicine, Dental Science Research Institute, School of Dentistry, Chonnam National University, Gwangju 61186, Republic of Korea
| | - Yeong-Gwan Im
- Department of Oral Medicine, Dental Science Research Institute, School of Dentistry, Chonnam National University, Gwangju 61186, Republic of Korea
| |
Collapse
|
19
|
Yan T, Qin YY, Wong PK, Ren H, Wong CH, Yao L, Hu Y, Chan CI, Gao S, Chan PP. Semantic Segmentation of Gastric Polyps in Endoscopic Images Based on Convolutional Neural Networks and an Integrated Evaluation Approach. Bioengineering (Basel) 2023; 10:806. [PMID: 37508833 PMCID: PMC10376250 DOI: 10.3390/bioengineering10070806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2023] [Revised: 06/27/2023] [Accepted: 07/03/2023] [Indexed: 07/30/2023] Open
Abstract
Convolutional neural networks (CNNs) have received increased attention in endoscopic image analysis due to their outstanding advantages. Clinically, some gastric polyps are related to gastric cancer, and accurate identification and timely removal are critical. CNN-based semantic segmentation can delineate each polyp region precisely, which benefits endoscopists in the diagnosis and treatment of gastric polyps. At present, only a few studies have used CNNs to automatically diagnose gastric polyps, and studies on their semantic segmentation are lacking. Therefore, we contribute pioneering research on gastric polyp segmentation in endoscopic images based on CNNs. Seven classical semantic segmentation models, including U-Net, UNet++, DeepLabv3, DeepLabv3+, Pyramid Attention Network (PAN), LinkNet, and Multi-scale Attention Net (MA-Net), with ResNet50, MobileNetV2, or EfficientNet-B1 encoders, are constructed and compared on the collected dataset. Because selecting among several CNN models is difficult when multiple criteria conflict, we propose an integrated evaluation approach that combines subjective considerations with objective information to identify the optimal model. UNet++ with the MobileNetV2 encoder obtains the best scores under the proposed integrated evaluation method and is selected to build the automated polyp-segmentation system. This study shows that the semantic segmentation model has high clinical value in the diagnosis of gastric polyps, and that the integrated evaluation approach provides an impartial and objective tool for selecting among numerous models. Our study can further advance the development of endoscopic gastrointestinal disease identification techniques, and the proposed evaluation technique has implications for mathematical model-based selection of clinical technologies.
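The selected configuration (UNet++ with a MobileNetV2 encoder) can be sketched with the segmentation_models_pytorch library; this is our illustration of that configuration under assumed input sizes, not the authors' code or pipeline.

```python
import torch
import segmentation_models_pytorch as smp

# UNet++ with a MobileNetV2 encoder producing a single-channel polyp mask.
model = smp.UnetPlusPlus(
    encoder_name="mobilenet_v2",
    encoder_weights="imagenet",
    in_channels=3,
    classes=1,            # single foreground class: polyp
)

x = torch.randn(2, 3, 256, 256)      # dummy endoscopic frames
with torch.no_grad():
    mask_logits = model(x)           # shape (2, 1, 256, 256)
probs = torch.sigmoid(mask_logits)   # per-pixel polyp probability
```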
Collapse
Affiliation(s)
- Tao Yan
- School of Mechanical Engineering, Hubei University of Arts and Science, Xiangyang 441053, China
- Department of Electromechanical Engineering, University of Macau, Taipa, Macau 999078, China
- Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang 441021, China
| | - Ye Ying Qin
- Department of Electromechanical Engineering, University of Macau, Taipa, Macau 999078, China
| | - Pak Kin Wong
- Department of Electromechanical Engineering, University of Macau, Taipa, Macau 999078, China
| | - Hao Ren
- Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang 441021, China
| | - Chi Hong Wong
- Faculty of Medicine, Macau University of Science and Technology, Taipa, Macau 999078, China
| | - Liang Yao
- Department of Electromechanical Engineering, University of Macau, Taipa, Macau 999078, China
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Ying Hu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Cheok I Chan
- School of Medicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Shan Gao
- Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang 441021, China
| | - Pui Pun Chan
- Department of General Surgery, Centro Hospitalar Conde de São Januário, Macau 999078, China
| |
Collapse
|
20
|
Gómez-Cárdenes Ó, Marichal-Hernández JG, Son JY, Pérez Jiménez R, Rodríguez-Ramos JM. An Encoder-Decoder Architecture within a Classical Signal-Processing Framework for Real-Time Barcode Segmentation. SENSORS (BASEL, SWITZERLAND) 2023; 23:6109. [PMID: 37447960 DOI: 10.3390/s23136109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/03/2023] [Revised: 06/25/2023] [Accepted: 06/26/2023] [Indexed: 07/15/2023]
Abstract
In this work, two methods are proposed for solving the problem of one-dimensional barcode segmentation in images, with an emphasis on augmented reality (AR) applications. Both methods take the partial discrete Radon transform as a building block. The first uses overlapping tiles to obtain good angular precision while maintaining good spatial precision. The second uses an encoder-decoder structure inspired by state-of-the-art convolutional segmentation networks while remaining within a classical processing framework, and therefore requires no training. The second method's processing time is shown to be lower than the video acquisition time for a 1024 × 1024 input on a CPU, which had not previously been achieved. Its accuracy on datasets widely used by the scientific community is almost on par with the most recent deep learning state of the art. Beyond the challenges of those datasets, the proposed method is particularly well suited to image sequences taken with short exposure and exhibiting motion blur and lens blur, as expected in real-world AR scenarios. Two implementations of the proposed methods are made available to the scientific community: one for easy prototyping and one optimised for parallel execution, which can run on desktop and mobile phone CPUs.
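To illustrate how a Radon-type projection can recover the bar orientation of a barcode inside a tile, here is a small sketch using scikit-image's dense radon transform. The paper uses the partial discrete Radon transform with overlapping tiles; the variance-argmax heuristic below is our own simplification for illustration only.

```python
import numpy as np
from skimage.transform import radon

def dominant_bar_angle(gray_tile, angles=np.arange(0.0, 180.0, 1.0)):
    """Estimate the orientation of 1-D barcode bars inside a small tile.

    The projection taken parallel to the bars preserves the bar pattern and
    therefore has the highest variance, so the argmax over projection
    variance gives the bar angle.
    """
    sinogram = radon(gray_tile.astype(float), theta=angles, circle=False)
    variances = sinogram.var(axis=0)          # one value per projection angle
    return angles[int(np.argmax(variances))]

# synthetic tile with vertical bars; the estimate should be close to 0 degrees
tile = np.tile((np.arange(64) % 8 < 4).astype(float), (64, 1))
print(dominant_bar_angle(tile))
```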
Collapse
Affiliation(s)
- Óscar Gómez-Cárdenes
- Department of Industrial Engineering, Universidad de La Laguna, 38200 La Laguna, Spain
| | | | - Jung-Young Son
- Biomedical Engineering Department, Konyang University, Nonsan-si 320-711, Republic of Korea
| | - Rafael Pérez Jiménez
- Institute for Technological Development and Innovation in Communications, Universidad de Las Palmas de Gran Canaria, 35017 Las Palmas, Spain
| | - José Manuel Rodríguez-Ramos
- Department of Industrial Engineering, Universidad de La Laguna, 38200 La Laguna, Spain
- Research & Development Department, Wooptix S.L., 38204 La Laguna, Spain
| |
Collapse
|
21
|
Darooei R, Nazari M, Kafieh R, Rabbani H. Optimal Deep Learning Architecture for Automated Segmentation of Cysts in OCT Images Using X-Let Transforms. Diagnostics (Basel) 2023; 13:1994. [PMID: 37370889 PMCID: PMC10297540 DOI: 10.3390/diagnostics13121994] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2023] [Revised: 05/22/2023] [Accepted: 06/02/2023] [Indexed: 06/29/2023] Open
Abstract
The retina is a thin, light-sensitive membrane with a multilayered structure at the back of the eyeball. There are many types of retinal disorders; the two most prevalent retinal illnesses are Age-Related Macular Degeneration (AMD) and Diabetic Macular Edema (DME). Optical Coherence Tomography (OCT) is a vital retinal imaging technology. X-lets (such as the curvelet, DTCWT, contourlet, etc.) have several benefits in image processing and analysis, as they can capture both local and non-local features of an image simultaneously. The aim of this paper is to propose an optimal deep learning architecture based on sparse basis functions for the automated segmentation of cystic areas in OCT images. Different X-let transforms were used to produce different network inputs, including the curvelet, Dual-Tree Complex Wavelet Transform (DTCWT), circlet, and contourlet. Additionally, three different combinations of these transforms are suggested to achieve more accurate segmentation results. Various metrics, including the Dice coefficient, sensitivity, false positive ratio, Jaccard index, and qualitative results, were evaluated to find the optimal networks and combinations of X-let sub-bands. The proposed network was tested on both original and noisy datasets. The results show that: (1) among the different combinations, the contourlet achieves the best results; (2) the five-channel decomposition using the high-pass sub-bands of the contourlet transform achieves the best performance; and (3) this five-channel high-pass sub-band formation outperforms state-of-the-art methods, especially on the noisy dataset. The proposed method has the potential to improve the accuracy and speed of the segmentation process in clinical settings, facilitating the diagnosis and treatment of retinal diseases.
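As a toy illustration of feeding transform sub-bands to a network, the sketch below stacks the high-pass bands of an ordinary single-level 2-D DWT (via PyWavelets) as input channels. This is a stand-in only: the contourlet, DTCWT, and circlet decompositions used in the paper are not part of PyWavelets, and the wavelet choice here is an assumption.

```python
import numpy as np
import pywt

def highpass_channels(image, wavelet="db2"):
    """Stack high-pass sub-bands of a single-level 2-D DWT as input channels.

    The horizontal, vertical, and diagonal detail bands become a 3-channel
    array that can be fed to a segmentation network instead of (or alongside)
    the raw B-scan.
    """
    _, (cH, cV, cD) = pywt.dwt2(image.astype(float), wavelet)
    return np.stack([cH, cV, cD], axis=0)   # shape: (3, ~H/2, ~W/2)

oct_bscan = np.random.rand(256, 512)        # dummy OCT B-scan
channels = highpass_channels(oct_bscan)
print(channels.shape)
```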
Collapse
Affiliation(s)
- Reza Darooei
- Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran; (R.D.); (R.K.)
- Department of Bioelectrics and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran
| | - Milad Nazari
- Department of Molecular Biology and Genetics, Aarhus University, 8200 Aarhus, Denmark;
- The Danish Research Institute of Translational Neuroscience (DANDRITE), Aarhus University, 8200 Aarhus, Denmark
| | - Rahele Kafieh
- Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran; (R.D.); (R.K.)
- Department of Engineering, Durham University, South Road, Durham DH1 3RW, UK
| | - Hossein Rabbani
- Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran; (R.D.); (R.K.)
- Department of Bioelectrics and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan 8174673461, Iran
| |
Collapse
|
22
|
Fan J, Zhang Z. Memory-Based Cross-Image Contexts for Weakly Supervised Semantic Segmentation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:6006-6020. [PMID: 36049013 DOI: 10.1109/tpami.2022.3203402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Weakly supervised semantic segmentation (WSSS) trains segmentation models using only weak labels, aiming to avoid the burden of expensive pixel-level annotations. This paper tackles the WSSS setting that uses image-level labels as the weak supervision. Previous approaches address this problem by focusing on generating better pseudo-masks from weak labels to train the segmentation model; however, they generally consider each image in isolation and overlook potential cross-image contexts. We emphasize that the cross-image contexts among a group of images can provide complementary information for obtaining better pseudo-masks. To effectively employ cross-image contexts, we develop an end-to-end cross-image context module containing a memory bank mechanism and a transformer-based cross-image attention module. The former extracts cross-image contexts online from the feature encodings of input images and stores them as memory. The latter mines useful information from the memorized contexts to supply the original queries with additional information for better pseudo-mask generation. We conduct detailed experiments on the Pascal VOC 2012 and COCO datasets to demonstrate the advantage of utilizing cross-image contexts, and state-of-the-art performance is also achieved. Code is available at https://github.com/js-fan/MCIC.git.
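A schematic sketch of attending over a memory bank of cross-image features is shown below. This is our own simplification in PyTorch; the paper's module design, memory-update policy, and dimensions differ, and the code at the linked repository is the authoritative implementation.

```python
import torch
import torch.nn as nn

class CrossImageAttention(nn.Module):
    """Sketch of querying a memory bank of cross-image context vectors.

    Per-image feature tokens attend over features memorized from other
    images, so each image can borrow complementary cues when generating
    pseudo-masks.
    """
    def __init__(self, dim=256, heads=4, memory_size=1024):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.register_buffer("memory", torch.zeros(memory_size, dim))
        self.ptr = 0

    @torch.no_grad()
    def update_memory(self, feats):            # feats: (N, dim)
        n = feats.shape[0]
        idx = (self.ptr + torch.arange(n)) % self.memory.shape[0]
        self.memory[idx] = feats
        self.ptr = int((self.ptr + n) % self.memory.shape[0])

    def forward(self, tokens):                 # tokens: (B, L, dim)
        mem = self.memory.unsqueeze(0).expand(tokens.shape[0], -1, -1)
        out, _ = self.attn(query=tokens, key=mem, value=mem)
        return tokens + out                    # residual enrichment

module = CrossImageAttention()
enriched = module(torch.randn(2, 196, 256))
```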
Collapse
|
23
|
Najafian K, Ghanbari A, Sabet Kish M, Eramian M, Shirdel GH, Stavness I, Jin L, Maleki F. Semi-Self-Supervised Learning for Semantic Segmentation in Images with Dense Patterns. PLANT PHENOMICS (WASHINGTON, D.C.) 2023; 5:0025. [PMID: 36930764 PMCID: PMC10013790 DOI: 10.34133/plantphenomics.0025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Accepted: 01/17/2023] [Indexed: 06/18/2023]
Abstract
Deep learning has shown potential in domains with large-scale annotated datasets. However, manual annotation is expensive, time-consuming, and tedious. Pixel-level annotations are particularly costly for semantic segmentation in images with dense irregular patterns of object instances, such as in plant images. In this work, we propose a method for developing high-performing deep learning models for semantic segmentation of such images utilizing little manual annotation. As a use case, we focus on wheat head segmentation. We synthesize a computationally annotated dataset (using a few annotated images, a short unannotated video clip of a wheat field, and several video clips with no wheat) to train a customized U-Net model. Considering the distribution shift between the synthesized and real images, we apply three domain adaptation steps to gradually bridge the domain gap. Using only two annotated images, we achieved a Dice score of 0.89 on the internal test set. When further evaluated on a diverse external dataset collected from 18 different domains across five countries, this model achieved a Dice score of 0.73. To expose the model to images from different growth stages and environmental conditions, we incorporated two annotated images from each of the 18 domains to further fine-tune the model. This increased the Dice score to 0.91. The result highlights the utility of the proposed approach in the absence of large annotated datasets. Although our use case is wheat head segmentation, the proposed approach can be extended to other segmentation tasks with similar characteristics of irregularly repeating patterns of object instances.
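For reference, the Dice score reported above can be computed on binary masks as follows (a straightforward NumPy sketch, not code from the paper):

```python
import numpy as np

def dice_score(pred_mask, true_mask, eps=1e-7):
    """Dice coefficient between two binary masks (values in {0, 1})."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

# two dummy 0/1 masks
a = np.random.rand(256, 256) > 0.5
b = np.random.rand(256, 256) > 0.5
print(round(dice_score(a, b), 3))   # roughly 0.5 for independent random masks
```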
Collapse
Affiliation(s)
- Keyhan Najafian
- Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
| | - Alireza Ghanbari
- Mathematics Department, Faculty of Sciences, University of Qom, Qom, Iran
| | - Mahdi Sabet Kish
- Department of Mathematics, Faculty of Mathematical Science, Shahid Beheshti University, Tehran, Iran
| | - Mark Eramian
- Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
| | | | - Ian Stavness
- Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
| | - Lingling Jin
- Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
| | - Farhad Maleki
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
| |
Collapse
|
24
|
Eldem H, Ülker E, Yaşar Işıklı O. Encoder–decoder semantic segmentation models for pressure wound images. THE IMAGING SCIENCE JOURNAL 2023. [DOI: 10.1080/13682199.2022.2163531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/14/2023]
Affiliation(s)
- Hüseyin Eldem
- Vocational School of Technical Sciences, Computer Technologies Department, Karamanoğlu Mehmetbey University, Karaman, Turkey
| | - Erkan Ülker
- Faculty of Engineering and Natural Sciences, Department of Computer Engineering, Konya Technical University, Konya, Turkey
| | - Osman Yaşar Işıklı
- Karaman Education and Research Hospital, Vascular Surgery Department, Karaman, Turkey
| |
Collapse
|
25
|
CLC-Net: Contextual and Local Collaborative Network for Lesion Segmentation in Diabetic Retinopathy Images. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]
|
26
|
AI meets UAVs: A survey on AI empowered UAV perception systems for precision agriculture. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2022.11.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
|
27
|
Zhu Y, Wang M, Yin X, Zhang J, Meijering E, Hu J. Deep Learning in Diverse Intelligent Sensor Based Systems. SENSORS (BASEL, SWITZERLAND) 2022; 23:62. [PMID: 36616657 PMCID: PMC9823653 DOI: 10.3390/s23010062] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/02/2022] [Revised: 12/06/2022] [Accepted: 12/14/2022] [Indexed: 05/27/2023]
Abstract
Deep learning has become a predominant method for solving data analysis problems in virtually all fields of science and engineering. The increasing complexity and the large volume of data collected by diverse sensor systems have spurred the development of deep learning methods and have fundamentally transformed the way the data are acquired, processed, analyzed, and interpreted. With the rapid development of deep learning technology and its ever-increasing range of successful applications across diverse sensor systems, there is an urgent need to provide a comprehensive investigation of deep learning in this domain from a holistic view. This survey paper aims to contribute to this by systematically investigating deep learning models/methods and their applications across diverse sensor systems. It also provides a comprehensive summary of deep learning implementation tips and links to tutorials, open-source codes, and pretrained models, which can serve as an excellent self-contained reference for deep learning practitioners and those seeking to innovate deep learning in this space. In addition, this paper provides insights into research topics in diverse sensor systems where deep learning has not yet been well-developed, and highlights challenges and future opportunities. This survey serves as a catalyst to accelerate the application and transformation of deep learning in diverse sensor systems.
Collapse
Affiliation(s)
- Yanming Zhu
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
| | - Min Wang
- School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2612, Australia
| | - Xuefei Yin
- School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2612, Australia
| | - Jue Zhang
- School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2612, Australia
| | - Erik Meijering
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
| | - Jiankun Hu
- School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2612, Australia
| |
Collapse
|
28
|
Malik OA, Puasa I, Lai DTC. Segmentation for Multi-Rock Types on Digital Outcrop Photographs Using Deep Learning Techniques. SENSORS (BASEL, SWITZERLAND) 2022; 22:8086. [PMID: 36365784 PMCID: PMC9654682 DOI: 10.3390/s22218086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/10/2022] [Revised: 10/11/2022] [Accepted: 10/17/2022] [Indexed: 06/16/2023]
Abstract
The basic identification and classification of sedimentary rocks into sandstone and mudstone are important in the study of sedimentology and are typically carried out by a sedimentologist. However, such manual activity involves countless hours of observation and data collection prior to any interpretation. When conducted in the field as part of an outcrop study, the sedimentologist is likely to be exposed to challenging conditions such as the weather and the accessibility of the outcrops. This study uses high-resolution photographs acquired from a sedimentological study to test an alternative approach to basic multi-rock identification through machine learning. While existing studies have effectively applied deep learning techniques to classify the rock types in field rock images, their approaches only handle a single rock-type classification per image. One study applied deep learning techniques to classify multiple rock types in each image; however, the test was performed on artificially overlaid images of different rock types rather than on naturally occurring rock surfaces containing multiple rock types. To the best of our knowledge, no study has applied semantic segmentation to solve the multi-rock classification problem using digital photographs of multiple rock types. This paper presents the application of two state-of-the-art segmentation models, namely U-Net and LinkNet, to identify multiple rock types in digital photographs by segmenting the sandstone, mudstone, and background classes in a self-collected dataset of 102 images from a field in Brunei Darussalam. Four pre-trained networks, including Resnet34, Inceptionv3, VGG16, and Efficientnetb7, were used as backbones for both models, and the performances of the individual models and their ensembles were compared. We also investigated the impact of image enhancement and different color representations on the performance of these segmentation models. The experimental results show that, among the individual models, LinkNet with an Efficientnetb7 backbone had the best performance, with a mean intersection over union (MIoU) of 0.8135 over all classes, while the ensemble of U-Net models (with all four backbones) performed slightly better, with an MIoU of 0.8201. When different color representations and image enhancements were explored, the best performance (MIoU = 0.8178) was observed for the L*a*b* color representation with Efficientnetb7 using U-Net segmentation. For the individual classes of interest (sandstone and mudstone), U-Net with Efficientnetb7 was found to be the best model for the segmentation. Thus, this study presents the potential of semantic segmentation for automating the reservoir characterization process, whereby patches of interest can be extracted from the rocks for deeper study and modeling.
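The MIoU values quoted above can be reproduced conceptually with a short NumPy routine like the following (an illustrative sketch; class indices and map sizes are made up):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection over union for integer-labelled segmentation maps."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                          # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.random.randint(0, 3, (128, 128))     # e.g. sandstone / mudstone / background
target = np.random.randint(0, 3, (128, 128))
print(round(mean_iou(pred, target, 3), 3))
```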
Collapse
Affiliation(s)
- Owais A. Malik
- School of Digital Science, Universiti Brunei Darussalam, Brunei Darussalam, Gadong BE1410, Brunei
- Institute of Applied Data Analytics, Universiti Brunei Darussalam, Brunei Darussalam, Gadong BE1410, Brunei
| | - Idrus Puasa
- Brunei Shell Petroleum, Brunei Darussalam, Panaga KB2933, Brunei
| | - Daphne Teck Ching Lai
- School of Digital Science, Universiti Brunei Darussalam, Brunei Darussalam, Gadong BE1410, Brunei
- Institute of Applied Data Analytics, Universiti Brunei Darussalam, Brunei Darussalam, Gadong BE1410, Brunei
| |
Collapse
|
29
|
Zhao S, Wang Y, Tian K. Using AAEHS-Net as an Attention-Based Auxiliary Extraction and Hybrid Subsampled Network for Semantic Segmentation. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:1536976. [PMID: 36275973 PMCID: PMC9586756 DOI: 10.1155/2022/1536976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/02/2022] [Accepted: 10/03/2022] [Indexed: 11/17/2022]
Abstract
Semantic segmentation based on deep learning has undergone remarkable advancements in recent years. However, because shallow features are often neglected, the problem of inaccurate segmentation has persisted. To address this issue, a semantic segmentation network, the attention-based auxiliary extraction and hybrid subsampled network (AAEHS-Net), is proposed in this study. The network uses a complementary and enhanced extraction module (CEEM) to extract deeper information together with shallow features, which improves the model's edge segmentation. Moreover, a hybrid subsampled module (HSM) is introduced to reduce the loss of features. Meanwhile, a global max pooling and global average pooling module (GAGM) is designed as an attention module to enhance features with global and important information and to maintain feature continuity. The proposed AAEHS-Net is evaluated on three datasets: the aerial drone image dataset, the Massachusetts roads dataset, and the Massachusetts buildings dataset. On these datasets, AAEHS-Net achieves 1.15%, 0.88%, and 2.1% higher accuracy than U-Net, reaching 90.12%, 96.23%, and 95.15%, respectively. At the same time, the proposed network obtains the best values for all evaluation metrics on the three datasets compared to currently popular algorithms.
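A minimal sketch of a channel-attention block driven by both global average and global max pooling, in the spirit of the GAGM described above, is given below. This is our own simplified PyTorch version; the layer sizes and reduction ratio are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AvgMaxChannelAttention(nn.Module):
    """Channel attention using both global average and global max pooling."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                        # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))       # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))        # global max pooling branch
        weights = torch.sigmoid(avg + mx)[:, :, None, None]
        return x * weights                       # channel-recalibrated features

attn = AvgMaxChannelAttention(64)
y = attn(torch.randn(2, 64, 32, 32))
```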
Collapse
Affiliation(s)
- Shan Zhao
- School of Software, Henan Polytechnic University, Jiaozuo 454003, China
| | - Yibo Wang
- School of Software, Henan Polytechnic University, Jiaozuo 454003, China
| | - Kaiwen Tian
- School of Software, Henan Polytechnic University, Jiaozuo 454003, China
| |
Collapse
|
30
|
Jia XZ, DongYe CL, Peng YJ, Zhao WX, Liu TD. MRBENet: A Multiresolution Boundary Enhancement Network for Salient Object Detection. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:7780756. [PMID: 36262601 PMCID: PMC9576351 DOI: 10.1155/2022/7780756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Revised: 09/08/2022] [Accepted: 09/24/2022] [Indexed: 11/17/2022]
Abstract
Salient Object Detection (SOD) simulates human visual perception by locating the most attractive objects in images. Existing methods based on convolutional neural networks have proven to be highly effective for SOD. However, in some cases these methods cannot both accurately detect intact objects and maintain their boundary details. In this paper, we present a Multiresolution Boundary Enhancement Network (MRBENet) that exploits edge features to optimize the localization and boundary fineness of salient objects. We incorporate a deeper convolutional layer into the backbone network to extract high-level semantic features and indicate the location of salient objects. Edge features of different resolutions are extracted by a U-shaped network. We designed a Feature Fusion Module (FFM) to fuse edge features and salient features, and a Feature Aggregation Module (FAM) based on spatial attention that performs multiscale convolutions to enhance salient features. The FFM and FAM allow the model to accurately locate salient objects and enhance boundary fineness. Extensive experiments on six benchmark datasets demonstrate that the proposed method is highly effective and improves the accuracy of salient object detection compared with state-of-the-art methods.
Collapse
Affiliation(s)
- Xing-Zhao Jia
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
| | - Chang-Lei DongYe
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
| | - Yan-Jun Peng
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
| | - Wen-Xiu Zhao
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
| | - Tian-De Liu
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
| |
Collapse
|
31
|
Divyanth LG, Marzougui A, González-Bernal MJ, McGee RJ, Rubiales D, Sankaran S. Evaluation of Effective Class-Balancing Techniques for CNN-Based Assessment of Aphanomyces Root Rot Resistance in Pea ( Pisum sativum L.). SENSORS (BASEL, SWITZERLAND) 2022; 22:7237. [PMID: 36236336 PMCID: PMC9572822 DOI: 10.3390/s22197237] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/26/2022] [Revised: 09/15/2022] [Accepted: 09/16/2022] [Indexed: 06/16/2023]
Abstract
Aphanomyces root rot (ARR) is a devastating disease that affects pea production. The plants are prone to infection at any growth stage, and there are no chemical or cultural controls. Thus, the development of resistant pea cultivars is important. Phenomics technologies that support the selection of resistant cultivars through phenotyping can be valuable. One such approach is to couple imaging technologies with deep learning algorithms, which are considered efficient for assessing disease resistance across a large number of plant genotypes. In this study, resistance to ARR was evaluated through a CNN-based assessment of pea root images. The proposed model, DeepARRNet, was designed to classify pea root images into three classes based on ARR severity scores, namely resistant, intermediate, and susceptible. The dataset consisted of 1581 pea root images with a skewed distribution. Hence, three effective data-balancing techniques were identified to address the prevalent problem of unbalanced datasets: random oversampling with image transformations, generative adversarial network (GAN)-based image synthesis, and a class-weighted loss function were implemented during the training process. The results indicated that the classification F1-score was 0.92 ± 0.03 when GAN-synthesized images were added, 0.91 ± 0.04 with random resampling, and 0.88 ± 0.05 with the class-weighted loss function, all higher than when the unbalanced dataset was used without these techniques (0.83 ± 0.03). The systematic approaches evaluated in this study can be applied to other image-based phenotyping datasets, which can aid the development of deep learning models with improved performance.
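Of the three balancing strategies, the class-weighted loss is the simplest to sketch. Below, inverse-frequency weights are passed to PyTorch's CrossEntropyLoss; the class counts are made up, and this weighting scheme is one common choice rather than necessarily the ratio used in the paper.

```python
import torch
import torch.nn as nn

# Inverse-frequency class weights for a skewed 3-class dataset
# (resistant / intermediate / susceptible); counts below are illustrative.
counts = torch.tensor([820.0, 510.0, 251.0])
weights = counts.sum() / (len(counts) * counts)    # rarer class -> larger weight
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(16, 3)                        # dummy model outputs
labels = torch.randint(0, 3, (16,))
loss = criterion(logits, labels)
```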
Collapse
Affiliation(s)
- L. G. Divyanth
- Department of Biological Systems Engineering, Washington State University, Pullman, WA 99164, USA
- Department of Agricultural and Food Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
| | - Afef Marzougui
- Department of Biological Systems Engineering, Washington State University, Pullman, WA 99164, USA
| | | | - Rebecca J. McGee
- Grain Legume Genetics and Physiology Research Unit, US Department of Agriculture-Agricultural Research Service (USDA-ARS), Pullman, WA 99164, USA
| | - Diego Rubiales
- The Institute for Sustainable Agriculture, Spanish National Research Council, 14001 Cordova, Spain
| | - Sindhuja Sankaran
- Department of Biological Systems Engineering, Washington State University, Pullman, WA 99164, USA
| |
Collapse
|
32
|
Chen HC, Xu SY, Deng KH. Water Color Identification System for Monitoring Aquaculture Farms. SENSORS (BASEL, SWITZERLAND) 2022; 22:7131. [PMID: 36236230 PMCID: PMC9571723 DOI: 10.3390/s22197131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/29/2022] [Revised: 09/12/2022] [Accepted: 09/16/2022] [Indexed: 06/16/2023]
Abstract
This study presents a vision-based water color identification system designed for monitoring aquaculture ponds. The algorithm proposed in this system can identify water color, which is an important factor in aquaculture farming management. To address the effect of outdoor lighting conditions on the proposed system, a color correction method using a color checkerboard was introduced. Several candidates for water-only image patches were extracted by performing image segmentation and fuzzy inferencing. Finally, a deep learning-based model was employed to identify the color of these patches and then find the representative color of the water. Experiments at different aquaculture sites verified the effectiveness of the proposed system and its algorithm. The color identification accuracy exceeded 96% for the test data.
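A common way to implement checkerboard-based color correction is to fit an affine mapping from measured patch colors to reference colors by least squares; the sketch below shows that idea. This is our own simplification, and the paper's exact correction model is not reproduced here.

```python
import numpy as np

def fit_color_correction(measured, reference):
    """Fit a 3x4 affine colour-correction matrix from checkerboard patches.

    measured, reference: (N, 3) arrays of mean RGB values of the N patches
    as seen by the camera and as specified for the chart, respectively.
    """
    A = np.hstack([measured, np.ones((measured.shape[0], 1))])   # (N, 4)
    M, *_ = np.linalg.lstsq(A, reference, rcond=None)            # (4, 3)
    return M

def apply_color_correction(image, M):
    flat = image.reshape(-1, 3).astype(float)
    flat = np.hstack([flat, np.ones((flat.shape[0], 1))]) @ M
    return np.clip(flat, 0, 255).reshape(image.shape)

# dummy data: 24 patches, plus a random frame to correct
measured = np.random.rand(24, 3) * 255
reference = np.random.rand(24, 3) * 255
M = fit_color_correction(measured, reference)
corrected = apply_color_correction(np.random.rand(120, 160, 3) * 255, M)
```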
Collapse
|
33
|
|
34
|
Xu W, Song H, Jin Y, Yan F. Video Super-Resolution with Frame-Wise Dynamic Fusion and Self-Calibrated Deformable Alignment. Neural Process Lett 2022. [DOI: 10.1007/s11063-021-10593-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
35
|
Self-Adaptive Clustering of Dynamic Multi-Graph Learning. Neural Process Lett 2022. [DOI: 10.1007/s11063-020-10405-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
36
|
Semi-supervised Learning with Graph Convolutional Networks Based on Hypergraph. Neural Process Lett 2022. [DOI: 10.1007/s11063-021-10487-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
37
|
Research for an Adaptive Classifier Based on Dynamic Graph Learning. Neural Process Lett 2022. [DOI: 10.1007/s11063-021-10452-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
38
|
Single Image Deraining by Fully Exploiting Contextual Information. Neural Process Lett 2022. [DOI: 10.1007/s11063-021-10486-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
39
|
CAMA: Class activation mapping disruptive attack for deep neural networks. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.05.065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
40
|
Deep Segmentation Networks for Segmenting Kidneys and Detecting Kidney Stones in Unenhanced Abdominal CT Images. Diagnostics (Basel) 2022; 12:diagnostics12081788. [PMID: 35892498 PMCID: PMC9330428 DOI: 10.3390/diagnostics12081788] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2022] [Revised: 07/20/2022] [Accepted: 07/20/2022] [Indexed: 11/17/2022] Open
Abstract
Despite recent breakthroughs of deep learning algorithms in medical imaging, automated detection and segmentation techniques for the kidneys in abdominal computed tomography (CT) images have remained limited. Radiomics and machine learning analyses of renal diseases rely on the automatic segmentation of kidneys in CT images. Inspired by this, our primary aim is to utilize deep semantic segmentation learning models with a proposed training scheme to achieve precise and accurate segmentation outcomes. Moreover, this work aims to provide the community with an open-source, unenhanced abdominal CT dataset for training and testing deep learning segmentation networks to segment kidneys and detect kidney stones. Five variations of deep segmentation networks are trained and tested both dependently (based on the proposed training scheme) and independently. Upon comparison, the models trained with the proposed training scheme enable highly accurate 2D and 3D segmentation of kidneys and kidney stones. We believe this work is a fundamental step toward AI-driven diagnostic strategies, which can be an essential component of personalized patient care and improved decision-making in treating kidney diseases.
Collapse
|
41
|
Guan T, Kothandaraman D, Chandra R, Sathyamoorthy AJ, Weerakoon K, Manocha D. GA-Nav: Efficient Terrain Segmentation for Robot Navigation in Unstructured Outdoor Environments. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3187278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Affiliation(s)
- Tianrui Guan
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | - Divya Kothandaraman
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | - Rohan Chandra
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | | | - Kasun Weerakoon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA
| | - Dinesh Manocha
- Department of Computer Science, University of Maryland, College Park, MD, USA
| |
Collapse
|
42
|
The study of coal gangue segmentation for location and shape predicts based on multispectral and improved Mask R-CNN. POWDER TECHNOL 2022. [DOI: 10.1016/j.powtec.2022.117655] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
|
43
|
Object Detection and Distance Measurement in Teleoperation. MACHINES 2022. [DOI: 10.3390/machines10050402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/10/2022]
Abstract
In recent years, teleoperation has experienced rapid development, and numerous teleoperation applications in diverse areas have been reported. Among all teleoperation-related components, computer vision (CV) is treated as one of the must-have technologies because it allows users to observe remote scenarios. In addition, CV can further help the user identify and track desired targets in complex scenes. It has been proven that efficient CV methods can significantly improve operation accuracy and relieve the user's physical and mental fatigue. Therefore, furthering the understanding of CV techniques and reviewing the latest research outcomes is necessary for teleoperation designers, and this review article was composed in that context.
Collapse
|
44
|
CSAUNet: A cascade self-attention u-shaped network for precise fundus vessel segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
45
|
Evaluation of Chinese Natural Language Processing System Based on Metamorphic Testing. MATHEMATICS 2022. [DOI: 10.3390/math10081276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
A natural language processing system can realize effective communication between humans and computers using natural language. Because its evaluation relies on large amounts of labeled data and on human judgment, how to systematically evaluate its quality remains a challenging task. In this article, we use metamorphic testing to evaluate natural language processing systems from the user's perspective, to help users better understand the functionalities of these systems and then select the appropriate system according to their specific needs. We defined three metamorphic relation patterns, each focusing on characteristics of a different aspect of natural language processing. On this basis, we defined seven metamorphic relations and chose three tasks (text similarity, text summarization, and text classification) to evaluate system quality, with Chinese as the target language. We extended the defined abstract metamorphic relations to these tasks, generating seven specific metamorphic relations for each task. We then judged whether the metamorphic relations were satisfied for each task and used them to evaluate the quality and robustness of the natural language processing system without reference outputs. We further applied the metamorphic tests to three mainstream natural language processing systems (the BaiduCloud API, AliCloud API, and TencentCloud API) on the PAWS-X, LCSTS, and THUCNews datasets. The experiments reveal the advantages and disadvantages of each system and further show that metamorphic testing can effectively evaluate natural language processing systems without annotated data.
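As an illustration of how a metamorphic relation can be checked without reference outputs, the sketch below tests label invariance under meaning-preserving rewrites for a text classifier. The classifier and paraphrase variants are toy stand-ins, not the evaluated cloud APIs or the paper's seven relations.

```python
def check_label_invariance(classify, text, variants):
    """Metamorphic check: a classifier's label should not change under
    meaning-preserving rewrites of the input. `classify` is any callable
    that returns a label string."""
    base = classify(text)
    violations = [v for v in variants if classify(v) != base]
    return len(violations) == 0, violations

# toy classifier and paraphrase variants (illustrative only)
toy = lambda t: "sports" if "football" in t else "other"
ok, bad = check_label_invariance(
    toy,
    "The football match was postponed.",
    ["The football game was postponed.", "They postponed the football match."],
)
print(ok)   # True: the label is stable across the paraphrases
```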
Collapse
|
46
|
CCAFFMNet: Dual-spectral semantic segmentation network with channel-coordinate attention feature fusion module. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.11.056] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
|
47
|
Rahman QM, Sunderhauf N, Corke P, Dayoub F. FSNet: A Failure Detection Framework for Semantic Segmentation. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3143219] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
48
|
Improving Semantic Segmentation of Urban Scenes for Self-Driving Cars with Synthetic Images. SENSORS 2022; 22:s22062252. [PMID: 35336422 PMCID: PMC8955070 DOI: 10.3390/s22062252] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Revised: 03/06/2022] [Accepted: 03/11/2022] [Indexed: 12/04/2022]
Abstract
Semantic segmentation of the incoming visual stream from cameras is an essential part of the perception system of self-driving cars. State-of-the-art results in semantic segmentation have been achieved with deep neural networks (DNNs), yet training them requires large datasets, which are difficult and costly to acquire and time-consuming to label. A viable alternative to training DNNs solely on real-world datasets is to augment them with synthetic images, which can be easily modified and generated in large numbers. In the present study, we aim to improve the accuracy of semantic segmentation of urban scenes by augmenting the Cityscapes real-world dataset with synthetic images generated with the open-source driving simulator CARLA (Car Learning to Act). Augmentation with synthetic images of low photorealism from the MICC-SRI (Media Integration and Communication Center-Semantic Road Inpainting) dataset does not improve the accuracy of semantic segmentation, yet both the MobileNetV2 and Xception DNNs used in the present study demonstrate better accuracy after training on the custom-made CCM (Cityscapes-CARLA Mixed) dataset, which contains both real-world Cityscapes images and high-resolution synthetic images generated with CARLA, than after training only on the real-world Cityscapes images. However, the accuracy of semantic segmentation does not improve proportionally to the amount of synthetic data used for augmentation, which indicates that augmenting with more synthetic data is not always better.
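Mixing real and synthetic samples for training can be done with a simple dataset concatenation in PyTorch, as sketched below. The tensor shapes and 19-class label space are illustrative; the paper's actual training pipeline is not reproduced here.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Stand-ins for a real-image dataset and a synthetic (simulator-rendered) one;
# any torch Dataset yielding (image, mask) pairs would work the same way.
real = TensorDataset(torch.randn(100, 3, 128, 256), torch.randint(0, 19, (100, 128, 256)))
synthetic = TensorDataset(torch.randn(300, 3, 128, 256), torch.randint(0, 19, (300, 128, 256)))

mixed = ConcatDataset([real, synthetic])
loader = DataLoader(mixed, batch_size=8, shuffle=True)
images, masks = next(iter(loader))   # batches draw from both sources
```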
Collapse
|
49
|
Treder KP, Huang C, Kim JS, Kirkland AI. Applications of deep learning in electron microscopy. Microscopy (Oxf) 2022; 71:i100-i115. [DOI: 10.1093/jmicro/dfab043] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2021] [Revised: 08/30/2021] [Accepted: 11/08/2021] [Indexed: 12/25/2022] Open
Abstract
We review the growing use of machine learning in electron microscopy (EM) driven in part by the availability of fast detectors operating at kiloHertz frame rates leading to large data sets that cannot be processed using manually implemented algorithms. We summarize the various network architectures and error metrics that have been applied to a range of EM-related problems including denoising and inpainting. We then provide a review of the application of these in both physical and life sciences, highlighting how conventional networks and training data have been specifically modified for EM.
Collapse
Affiliation(s)
- Kevin P Treder
- Department of Materials, University of Oxford, Oxford, Oxfordshire OX1 3PH, UK
| | - Chen Huang
- Rosalind Franklin Institute, Harwell Research Campus, Didcot, Oxfordshire OX11 0FA, UK
| | - Judy S Kim
- Department of Materials, University of Oxford, Oxford, Oxfordshire OX1 3PH, UK
- Rosalind Franklin Institute, Harwell Research Campus, Didcot, Oxfordshire OX11 0FA, UK
| | - Angus I Kirkland
- Department of Materials, University of Oxford, Oxford, Oxfordshire OX1 3PH, UK
- Rosalind Franklin Institute, Harwell Research Campus, Didcot, Oxfordshire OX11 0FA, UK
| |
Collapse
|
50
|
Mapping of Dwellings in IDP/Refugee Settlements from Very High-Resolution Satellite Imagery Using a Mask Region-Based Convolutional Neural Network. REMOTE SENSING 2022. [DOI: 10.3390/rs14030689] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Earth-observation-based mapping plays a critical role in humanitarian responses by providing timely and accurate information in inaccessible areas, or in situations where frequent updates and monitoring are required, such as internally displaced population (IDP)/refugee settlements. Manual information extraction pipelines are slow and resource inefficient. Advances in deep learning, especially convolutional neural networks (CNNs), are providing state-of-the-art possibilities for automating information extraction. This study investigates a deep convolutional neural network-based Mask R-CNN model for dwelling extraction in IDP/refugee settlements. The study uses a time series of very high-resolution satellite images from WorldView-2 and WorldView-3. The model was trained with transfer learning through domain adaptation from non-remote-sensing tasks. The capability of a model trained on historical images to detect dwelling features in completely unseen, newly obtained images through temporal transfer was also investigated. The results show that transfer learning provides better performance than training the model from scratch, with improvements of 4.5 to 15.3% in MIoU and 18.6 to 25.6% in the overall quality of the extracted dwellings, depending on the source of the pretrained weights and the input image. Once trained on historical images, the model achieved 62.9, 89.3, and 77% for the object-based mean intersection over union (MIoU), completeness, and quality metrics, respectively, on completely unseen images.
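A typical way to fine-tune a Mask R-CNN for a single "dwelling" class is the standard torchvision transfer-learning recipe sketched below. This mirrors common practice rather than the exact configuration used in the study; the image size and class count are assumptions.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Start from a COCO-pretrained Mask R-CNN and swap in 2-class heads
# (background + dwelling) before fine-tuning on satellite chips.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, num_classes=2)

in_feat_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feat_mask, 256, num_classes=2)

# inference on one dummy image chip; training would pass (images, targets) instead
model.eval()
with torch.no_grad():
    preds = model([torch.rand(3, 512, 512)])   # list of dicts: boxes, labels, scores, masks
```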
Collapse
|