1. Kohler M, Eisenbach M, Gross HM. Few-Shot Object Detection: A Comprehensive Survey. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:11958-11978. [PMID: 37067965] [DOI: 10.1109/tnnls.2023.3265051]
Abstract
Humans are able to learn to recognize new objects even from a few examples. In contrast, training deep-learning-based object detectors requires huge amounts of annotated data. To avoid having to acquire and annotate these huge amounts of data, few-shot object detection (FSOD) aims to learn from a few object instances of new categories in the target domain. In this survey, we provide an overview of the state of the art in FSOD. We categorize approaches according to their training scheme and architectural layout. For each type of approach, we describe the general realization as well as concepts to improve the performance on novel categories. Where appropriate, we give short takeaways on these concepts to highlight the best ideas. Finally, we introduce commonly used datasets and their evaluation protocols and analyze the reported benchmark results. In doing so, we emphasize common challenges in evaluation and identify the most promising current trends in the emerging field of FSOD.
2. Li G, Cheng D, Ding X, Wang N, Li J, Gao X. Weakly Supervised Temporal Action Localization With Bidirectional Semantic Consistency Constraint. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:13032-13045. [PMID: 37134038] [DOI: 10.1109/tnnls.2023.3266062]
Abstract
Weakly supervised temporal action localization (WTAL) aims to classify actions and localize their temporal boundaries in videos given only video-level category labels during training. Owing to the lack of boundary information at training time, existing approaches formulate WTAL as a classification problem, i.e., they generate a temporal class activation map (T-CAM) for localization. With only a classification loss, however, the model is suboptimal: action-related scenes alone are sufficient to distinguish the class labels. Treating other actions in an action-related scene (i.e., a scene shared with positive actions) as co-scene actions, such a suboptimal model misclassifies co-scene actions as positive actions. To address this misclassification, we propose a simple yet effective method, named the bidirectional semantic consistency constraint (Bi-SCC), to discriminate positive actions from co-scene actions. The proposed Bi-SCC first applies a temporal context augmentation to generate an augmented video that breaks the correlation between positive actions and their co-scene actions across videos. A semantic consistency constraint (SCC) is then used to enforce consistency between the predictions for the original and augmented videos, thereby suppressing co-scene actions. However, we find that the augmented video destroys the original temporal context, so naively applying the consistency constraint would compromise the completeness of localized positive actions. Hence, we enforce the SCC bidirectionally, cross-supervising the original and augmented videos, to suppress co-scene actions while preserving the integrity of positive actions. Our Bi-SCC can be plugged into current WTAL approaches and improves their performance. Experimental results show that our approach outperforms state-of-the-art methods on THUMOS14 and ActivityNet. The code is available at https://github.com/lgzlIlIlI/BiSCC.
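To make the consistency idea concrete, here is a minimal sketch of a bidirectional consistency term in PyTorch, assuming snippet-level class scores that are already aligned between the original and augmented videos; the function name and the MSE form are illustrative assumptions, not the authors' implementation.

```python
import torch.nn.functional as F

def bi_scc_loss(scores_orig, scores_aug):
    """Bidirectional consistency sketch: snippet-level class scores ([T, C])
    of the original and context-augmented videos cross-supervise each other,
    with a stop-gradient on the "teacher" side of each direction. Assumes the
    two score sequences are aligned snippet-by-snippet (an assumption)."""
    # Augmented predictions follow the original ones: responses that existed
    # only because of the (now broken) co-scene correlation are suppressed.
    to_aug = F.mse_loss(scores_aug, scores_orig.detach())
    # Original predictions follow the augmented ones: positive actions stay
    # complete despite the destroyed temporal context.
    to_orig = F.mse_loss(scores_orig, scores_aug.detach())
    return to_aug + to_orig
```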
3. Ji Z, An P, Liu X, Gao C, Pang Y, Shao L. Semantic-Aware Dynamic Generation Networks for Few-Shot Human-Object Interaction Recognition. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:12564-12575. [PMID: 37037250] [DOI: 10.1109/tnnls.2023.3263660]
Abstract
Recognizing human-object interactions (HOIs) requires inferring various relationships between actions and objects. Although great progress has been made in HOI, the long-tail and combinatorial explosion problems remain practical challenges. To this end, we formulate HOI as a few-shot task to tackle both challenges and design a novel dynamic generation method to address it. The proposed approach is called semantic-aware dynamic generation networks (SADG-Nets). Specifically, SADG-Net first assigns semantic-aware task representations to different batches of data, from which it generates dynamic parameters. It thereby obtains features that adaptively highlight intercategory discriminability and intracategory commonality. In addition, we design a dual semantic-aware encoder module (DSAE-Module), i.e., verb-aware and noun-aware branches, to yield both action and object prototypes of HOI for each task space, which generalizes to novel combinations by transferring similarities among interactions. Extensive experimental results on two benchmark datasets, i.e., humans interacting with common objects (HICO)-FS and trento universal HOI (TUHOI)-FS, show that SADG-Net outperforms state-of-the-art approaches, demonstrating its effectiveness for few-shot HOI recognition.
4. Qin H, Cai M, Qin H. NABNet: Deep Learning-Based IoT Alert System for Detection of Abnormal Neck Behavior. Sensors (Basel) 2024; 24:5379. [PMID: 39205072] [PMCID: PMC11360098] [DOI: 10.3390/s24165379]
Abstract
The excessive use of electronic devices for prolonged periods has led to problems such as neck pain and pressure injury in sedentary people. If not detected and corrected early, these issues can pose serious risks to physical health. Generic object detectors cannot adequately capture such subtle neck behaviors, resulting in missed detections. In this paper, we explore a deep learning-based solution for detecting abnormal neck behavior and propose a model called NABNet, which combines object detection based on YOLOv5s with pose estimation based on Lightweight OpenPose. NABNet extracts detailed neck behavior characteristics from global to local and detects abnormal behavior by analyzing the angles derived from the pose data. We deployed NABNet on cloud and edge devices to achieve remote monitoring and abnormal behavior alarms. Finally, we applied the resulting NABNet-based IoT system to abnormal behavior detection in order to evaluate its effectiveness. The experimental results show that our system can effectively detect abnormal neck behavior and raise alarms on the cloud platform, with the highest accuracy reaching 94.13%.
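As a rough illustration of the angle-analysis step, the sketch below computes a neck tilt angle from two 2-D keypoints of the kind Lightweight OpenPose produces; the keypoint pair, the threshold, and the function names are assumptions for illustration, not the paper's specification.

```python
import numpy as np

def neck_angle(neck_xy, nose_xy):
    """Angle (degrees) between the neck->nose vector and the vertical image
    axis, from 2-D pose keypoints. Larger angles suggest a stronger tilt."""
    v = np.asarray(nose_xy, float) - np.asarray(neck_xy, float)
    vertical = np.array([0.0, -1.0])          # image y-axis points downward
    cos = v @ vertical / (np.linalg.norm(v) + 1e-8)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

ALERT_THRESHOLD_DEG = 30.0                    # hypothetical alert threshold
if neck_angle((120, 200), (150, 160)) > ALERT_THRESHOLD_DEG:
    print("abnormal neck posture detected")   # ~36.9 degrees here
```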
Affiliation(s)
- Hongshuai Qin
- School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, China
5. Wang J, Qiao L, Zhou S, Zhou J, Wang J, Li J, Ying S, Chang C, Shi J. Weakly Supervised Lesion Detection and Diagnosis for Breast Cancers With Partially Annotated Ultrasound Images. IEEE Transactions on Medical Imaging 2024; 43:2509-2521. [PMID: 38373131] [DOI: 10.1109/tmi.2024.3366940]
Abstract
Deep learning (DL) has proven highly effective for ultrasound-based computer-aided diagnosis (CAD) of breast cancers. In an automatic CAD system, lesion detection is critical for the subsequent diagnosis. However, existing DL-based methods generally require voluminous manually-annotated region of interest (ROI) labels and class labels to train both the lesion detection and diagnosis models. In clinical practice, the ROI labels, i.e., ground truths, may not always be optimal for the classification task due to the individual experience of sonologists, and this coarse annotation limits the diagnosis performance of a CAD model. To address this issue, a novel Two-Stage Detection and Diagnosis Network (TSDDNet) is proposed based on weakly supervised learning to improve the diagnostic accuracy of ultrasound-based CAD for breast cancers. In particular, all initial ROI-level labels are treated as coarse annotations before model training. In the first training stage, a candidate selection mechanism refines the manual ROIs in the fully annotated images and generates accurate pseudo-ROIs for the partially annotated images under the guidance of class labels. The training set is then updated with more accurate ROI labels for the second training stage, in which a fusion network integrates the detection network and the classification network into a unified end-to-end framework as the final CAD model. A self-distillation strategy is designed on this model for joint optimization to further improve its diagnostic performance. The proposed TSDDNet is evaluated on three B-mode ultrasound datasets, and the experimental results indicate that it achieves the best performance on both the lesion detection and diagnosis tasks, suggesting promising application potential.
6. Shi J, Zhang K, Guo C, Yang Y, Xu Y, Wu J. A survey of label-noise deep learning for medical image analysis. Med Image Anal 2024; 95:103166. [PMID: 38613918] [DOI: 10.1016/j.media.2024.103166]
Abstract
Several factors are associated with the success of deep learning. One of the most important is the availability of large-scale datasets with clean annotations. However, obtaining datasets with accurate labels in the medical imaging domain is challenging: the reliability and consistency of medical labeling are among the issues, and low-quality annotations with label noise are common. Because noisy labels reduce the generalization performance of deep neural networks, learning with noisy labels is becoming an essential task in medical image analysis. Literature on this topic has expanded in volume and scope, but no recent surveys have collected and organized this knowledge, impeding the ability of researchers and practitioners to utilize it. In this work, we present an up-to-date survey of label-noise learning for the medical image domain. We review the extensive literature, illustrate typical methods, and present unified taxonomies in terms of methodological differences. We then compare the methods and discuss their respective advantages and disadvantages. Finally, we discuss new research directions based on the characteristics of medical images. Our survey aims to provide researchers and practitioners with a solid understanding of existing medical label-noise learning, such as the main algorithms developed over the past few years, and thereby help them investigate new methods to combat the negative effects of label noise.
Affiliation(s)
- Jialin Shi
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Kailai Zhang
- Department of Networks, China Mobile Communications Group Co., Ltd., Beijing, China
- Chenyi Guo
- Department of Electronic Engineering, Tsinghua University, Beijing, China
- Yali Xu
- Department of Breast Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Ji Wu
- Department of Electronic Engineering, Tsinghua University, Beijing, China
7. Su L, Fei L, Zhang B, Zhao S, Wen J, Xu Y. Complete Region of Interest for Unconstrained Palmprint Recognition. IEEE Transactions on Image Processing 2024; 33:3662-3675. [PMID: 38837937] [DOI: 10.1109/tip.2024.3407666]
Abstract
Unconstrained palmprint images have shown great potential for recognition applications due to their lower restrictions on hand poses and backgrounds during contactless image acquisition. However, they face two challenges: 1) unclear palm contours and finger-valley points make it difficult to locate the landmarks used to crop the palmprint region of interest (ROI); and 2) large intra-class diversities hinder the learning of intra-class-invariant palmprint features. In this paper, we propose to directly extract the complete palmprint region as the ROI (CROI) using a detection-style CenterNet, without requiring the detection of any landmarks; large intra-class diversities, however, may remain. To address this, we further propose a palmprint feature alignment and learning hybrid network (PalmALNet) for unconstrained palmprint recognition. Specifically, we first exploit and align the multi-scale shallow representation of unconstrained palmprint images via deformable convolution and alignment-aware supervision, such that the pixel gaps between intra-class palmprint CROIs are minimized in the shallow feature space. Then, we develop multiple triple-attention learning modules that integrate spatial, channel, and self-attention operations into convolution to adaptively learn and highlight latent identity-invariant palmprint information, enhancing the overall discriminative power of the palmprint features. Extensive experimental results on four challenging palmprint databases demonstrate the promising effectiveness of both the proposed PalmALNet and the CROI for unconstrained palmprint recognition.
8. Lin Y, Wang Z, Zhang D, Cheng KT, Chen H. BoNuS: Boundary Mining for Nuclei Segmentation With Partial Point Labels. IEEE Transactions on Medical Imaging 2024; 43:2137-2147. [PMID: 38231818] [DOI: 10.1109/tmi.2024.3355068]
Abstract
Nuclei segmentation is a fundamental prerequisite in the digital pathology workflow. The development of automated methods for nuclei segmentation enables quantitative analysis of the wide existence and large variances in nuclei morphometry in histopathology images. However, manual annotation of tens of thousands of nuclei is tedious and time-consuming and requires a significant amount of human effort and domain-specific expertise. To alleviate this problem, we propose a weakly-supervised nuclei segmentation method that requires only partial point labels of nuclei. Specifically, we propose a novel boundary mining framework for nuclei segmentation, named BoNuS, which simultaneously learns nuclei interior and boundary information from the point labels. To achieve this goal, we propose a novel boundary mining loss, which guides the model to learn boundary information by exploring pairwise pixel affinity in a multiple-instance learning manner. We then consider a more challenging setting, i.e., partial point labels, for which we propose a nuclei detection module with curriculum learning to detect the missing nuclei using prior morphological knowledge. The proposed method is validated on three public datasets: MoNuSeg, CPM, and CoNIC. Experimental results demonstrate the superior performance of our method over state-of-the-art weakly-supervised nuclei segmentation methods. Code: https://github.com/hust-linyi/bonus.
9. Wan Y, Zhong Y, Ma A, Wang J, Zhang L. E2SCNet: Efficient Multiobjective Evolutionary Automatic Search for Remote Sensing Image Scene Classification Network Architecture. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7752-7766. [PMID: 36395135] [DOI: 10.1109/tnnls.2022.3220699]
Abstract
Remote sensing image scene classification methods based on deep learning have been widely studied and discussed. However, most network architectures are borrowed directly from natural image processing and are fixed. A few studies have focused on automatic search mechanisms, but they cannot balance interpretation accuracy against parameter count for practical applications. As a result, automatic global search methods based on multiobjective evolutionary computation have more advantages. However, in the ranking process, network individuals with large parameter counts are easily eliminated, even though they may reach higher accuracy after full training. In addition, evolutionary neural architecture search methods often take several days. In this article, to address the above concerns, we propose an efficient multiobjective evolutionary automatic search framework for remote sensing image scene classification deep learning network architectures (E2SCNet). In E2SCNet, eight kinds of lightweight operators are used to build a diversified search space, and the coding connection mode is flexible. In the search process, a large-model retention mechanism is implemented through two-step multiobjective modeling and evolutionary search, where one step involves the "parameter quantity and accuracy" and the other involves the "parameter quantity and accuracy growth quantity." Moreover, a super network is constructed to share weights during individual network evaluation and speed up the search. The effectiveness of E2SCNet is proven by comparison with several networks designed by human experts and networks obtained by gradient- and evolutionary-computation-based search methods.
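The two-step multiobjective modeling rests on Pareto ranking of candidate networks. Below is a small, self-contained sketch of extracting the non-dominated front for two minimized objectives (parameter count and error rate); the objective pairing is an illustrative assumption, not E2SCNet's exact formulation.

```python
def pareto_front(population):
    """Return the non-dominated (params, error) pairs, both minimized.
    An individual is dominated if another is no worse in both objectives
    and strictly better in at least one."""
    front = []
    for i, a in enumerate(population):
        dominated = any(
            b[0] <= a[0] and b[1] <= a[1] and b != a
            for j, b in enumerate(population) if j != i
        )
        if not dominated:
            front.append(a)
    return front

# Toy candidates: (parameter count, validation error)
print(pareto_front([(1e6, 0.10), (2e6, 0.08), (3e6, 0.09), (2e6, 0.12)]))
# -> [(1000000.0, 0.1), (2000000.0, 0.08)]
```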
10. Tan D, Huang Z, Peng X, Zhong W, Mahalec V. Deep Adaptive Fuzzy Clustering for Evolutionary Unsupervised Representation Learning. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:6103-6117. [PMID: 37027776] [DOI: 10.1109/tnnls.2023.3243666]
Abstract
Cluster assignment of large and complex datasets is a crucial but challenging task in pattern recognition and computer vision. In this study, we explore the possibility of employing fuzzy clustering in a deep neural network framework and present a novel evolutionary unsupervised representation learning model with iterative optimization. It implements a deep adaptive fuzzy clustering (DAFC) strategy that learns a convolutional neural network classifier from only unlabeled data samples. DAFC consists of a deep feature quality-verifying model and a fuzzy clustering model, implementing a deep feature representation learning loss function and embedded fuzzy clustering with weighted adaptive entropy. We couple fuzzy clustering with the deep reconstruction model, in which fuzzy membership is utilized to represent a clear structure of deep cluster assignments and to jointly optimize deep representation learning and clustering. The joint model also evaluates the current clustering performance by inspecting whether data resampled from the estimated bottleneck space have consistent clustering properties, progressively improving the deep clustering model. Experiments on various datasets show that the proposed method achieves substantially better reconstruction and clustering quality than state-of-the-art deep clustering methods, as demonstrated by the in-depth analysis in our extensive experiments.
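For readers unfamiliar with the fuzzy-membership machinery, the sketch below shows the classical fuzzy c-means membership update that underlies the fuzzy clustering component; DAFC's weighted adaptive entropy variant modifies this rule, so treat this as background, not the paper's method.

```python
import numpy as np

def fuzzy_memberships(X, centers, m=2.0):
    """Classical fuzzy c-means update: u[i, k] is the degree to which sample i
    belongs to cluster k, and each row of u sums to one."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1) + 1e-10
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)

X = np.random.rand(100, 8)    # e.g., bottleneck features from the encoder
centers = X[np.random.choice(100, 3, replace=False)]
U = fuzzy_memberships(X, centers)
assert np.allclose(U.sum(axis=1), 1.0)
```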
11. Liang Y, Zhu L, Wang X, Yang Y. Penalizing the Hard Example But Not Too Much: A Strong Baseline for Fine-Grained Visual Classification. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7048-7059. [PMID: 36409807] [DOI: 10.1109/tnnls.2022.3213563]
Abstract
Though significant progress has been achieved on fine-grained visual classification (FGVC), severe overfitting still hinders model generalization. A recent study shows that hard samples in the training set can be fit easily, yet most existing FGVC methods fail to classify some hard examples in the test set: the model overfits the hard examples in the training set but does not learn to generalize to unseen examples. In this article, we propose a moderate hard example modulation (MHEM) strategy to properly modulate the hard examples. MHEM encourages the model not to overfit hard examples and offers better generalization and discrimination. First, we introduce three conditions and formulate a general form of a modulated loss function. Second, we instantiate the loss function and provide a strong baseline for FGVC, with which the performance of a naive backbone can be boosted to be comparable with recent methods. Moreover, we demonstrate that our baseline can be readily incorporated into existing methods and empowers these methods to be more discriminative. Equipped with our strong baseline, we achieve consistent improvements on three typical FGVC datasets, i.e., CUB-200-2011, Stanford Cars, and FGVC-Aircraft. We hope the idea of moderate hard example modulation will inspire future research toward more effective fine-grained visual recognition.
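One way to picture a "moderate" modulation is a hard-example weight that grows as the target probability drops but saturates, so very hard (possibly noisy) examples are not over-penalized. The sketch below is one possible instantiation under these assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def moderated_ce(logits, targets, gamma=1.0, tau=0.5):
    """Cross-entropy with a saturating hard-example weight: hard examples
    (low target probability p) get larger weights, clipped at tau so they are
    penalized 'but not too much'. gamma and tau are illustrative knobs."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    p = torch.softmax(logits, dim=1).gather(1, targets[:, None]).squeeze(1)
    w = torch.clamp((1.0 - p) ** gamma, max=tau) / tau
    return (w * ce).mean()

logits = torch.randn(8, 200)            # CUB-200-2011 has 200 classes
targets = torch.randint(0, 200, (8,))
print(moderated_ce(logits, targets))
```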
12. Chai Z, Luo L, Lin H, Heng PA, Chen H. Deep Omni-Supervised Learning for Rib Fracture Detection From Chest Radiology Images. IEEE Transactions on Medical Imaging 2024; 43:1972-1982. [PMID: 38215335] [DOI: 10.1109/tmi.2024.3353248]
Abstract
Deep learning (DL)-based rib fracture detection has shown promise for playing an important role in preventing mortality and improving patient outcomes. Developing DL-based object detection models normally requires a huge amount of bounding-box annotation. However, annotating medical data is time-consuming and expertise-demanding, making it extremely infeasible to obtain a large amount of fine-grained annotations. This poses a pressing need for label-efficient detection models that alleviate radiologists' labeling burden. To tackle this challenge, the object detection literature has witnessed an increase in weakly-supervised and semi-supervised approaches, yet it still lacks a unified framework that leverages various forms of fully-labeled, weakly-labeled, and unlabeled data. In this paper, we present a novel omni-supervised object detection network, ORF-Netv2, to leverage as much available supervision as possible. Specifically, a multi-branch omni-supervised detection head is introduced, with each branch trained on a specific type of supervision. A co-training-based dynamic label assignment strategy is then proposed to enable flexible and robust learning from the weakly-labeled and unlabeled data. Extensive evaluation was conducted with three rib fracture datasets on both chest CT and X-ray. By leveraging all forms of supervision, ORF-Netv2 achieves mAPs of 34.7, 44.7, and 19.4 on the three datasets, respectively, surpassing the baseline detector, which uses only box annotations, by mAP gains of 3.8, 4.8, and 5.0. Furthermore, ORF-Netv2 consistently outperforms other competitive label-efficient methods over various scenarios, showing a promising framework for label-efficient fracture detection. The code is available at: https://github.com/zhizhongchai/ORF-Net.
13. Zhang D, Guo G, Zeng W, Li L, Han J. Generalized Weakly Supervised Object Localization. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:5395-5406. [PMID: 36129872] [DOI: 10.1109/tnnls.2022.3204337]
Abstract
With the goal of learning to localize specific object semantics using low-cost image-level annotation, weakly supervised object localization (WSOL) has received increasing attention in recent years. Although the existing literature has studied a number of major issues in this field, one important yet challenging scenario, in which the test object semantics may have appeared in the training phase (seen categories) or never been observed before (unseen categories), remains unexplored. We define this scenario as generalized WSOL (GWSOL) and make a pioneering effort to study it in this article. By leveraging attribute vectors to associate seen and unseen categories, we incorporate threefold modeling components, i.e., class-sensitive, semantic-agnostic, and content-aware modeling, into a unified end-to-end learning framework. This design enables our model to recognize and localize unconstrained object semantics, learn compact and discriminative features that can represent the potential unseen categories, and customize content-aware attribute weights to avoid localizing on misleading attribute elements. To advance this research direction, we contribute bounding-box manual annotations for the widely used AwA2 dataset and benchmark GWSOL methods. Comprehensive experiments demonstrate the effectiveness of our proposed learning framework and of each modeling component.
14. Li H, Cao J, You K, Zhang Y, Ye J. Artificial intelligence-assisted management of retinal detachment from ultra-widefield fundus images based on weakly-supervised approach. Front Med (Lausanne) 2024; 11:1326004. [PMID: 38379556] [PMCID: PMC10876892] [DOI: 10.3389/fmed.2024.1326004]
Abstract
Background: Retinal detachment (RD) is a common sight-threatening condition in the emergency department. Early postural intervention based on detachment regions can improve visual prognosis.
Methods: We developed a weakly supervised model with 24,208 ultra-widefield fundus images to localize and coarsely outline the anatomical RD regions; customized preoperative postural guidance was generated for patients accordingly. The localization performance was then compared with a baseline model and an ophthalmologist according to the reference standard established by retina experts.
Results: In the 48-partition lesion detection, our proposed model reached 86.42% (95% confidence interval (CI): 85.81-87.01%) precision and 83.27% (95% CI: 82.62-83.90%) recall, with an average precision (AP) of 0.9132. In contrast, the baseline model achieved 92.67% (95% CI: 92.11-93.19%) precision but a limited recall of 68.07% (95% CI: 67.25-68.88%). Our holistic lesion localization performance was comparable to the ophthalmologist's 89.16% (95% CI: 88.75-89.55%) precision and 83.38% (95% CI: 82.91-83.84%) recall. For four-zone anatomical localization, compared with the ground truth, the unweighted Cohen's κ coefficients were 0.710 (95% CI: 0.659-0.761) and 0.753 (95% CI: 0.702-0.804) for the weakly-supervised model and the general ophthalmologist, respectively.
Conclusion: The proposed weakly-supervised deep learning model performed comparably to the general ophthalmologist in localizing and outlining the RD regions. It could greatly facilitate the management of RD patients, especially medical referral and patient education.
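For reference, the unweighted Cohen's κ reported above can be computed as follows; the toy four-zone labels are illustrative only.

```python
import numpy as np

def cohens_kappa(y1, y2, n_classes=4):
    """Unweighted Cohen's kappa between two raters' zone assignments."""
    cm = np.zeros((n_classes, n_classes))
    for a, b in zip(y1, y2):
        cm[a, b] += 1
    n = cm.sum()
    po = np.trace(cm) / n                        # observed agreement
    pe = (cm.sum(0) * cm.sum(1)).sum() / n ** 2  # chance agreement
    return (po - pe) / (1 - pe)

print(cohens_kappa([0, 1, 2, 3, 1, 0], [0, 1, 2, 2, 1, 0]))  # ~0.769
```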
Affiliation(s)
- Huimin Li
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Jing Cao
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Kun You
- Zhejiang Feitu Medical Imaging Co., Ltd, Hangzhou, Zhejiang, China
- Yuehua Zhang
- Zhejiang Feitu Medical Imaging Co., Ltd, Hangzhou, Zhejiang, China
- Juan Ye
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
15. Bakouri M, Alyami N, Alassaf A, Waly M, Alqahtani T, AlMohimeed I, Alqahtani A, Samsuzzaman M, Ismail HF, Alharbi Y. Sound-Based Localization Using LSTM Networks for Visually Impaired Navigation. Sensors (Basel) 2023; 23:4033. [PMID: 37112374] [PMCID: PMC10145617] [DOI: 10.3390/s23084033]
Abstract
In this work, we developed a prototype that adopts sound-based systems for the localization of visually impaired individuals. The system was implemented on a wireless ultrasound network, which helps blind and visually impaired users navigate and maneuver autonomously. Ultrasonic-based systems use high-frequency sound waves to detect obstacles in the environment and provide location information to the user. Voice recognition and long short-term memory (LSTM) techniques were used to design the algorithms, and Dijkstra's algorithm was used to determine the shortest distance between two places. Assistive hardware tools, including an ultrasonic sensor network, a global positioning system (GPS), and a digital compass, were utilized to implement this method. For the indoor evaluation, three nodes were placed on the doors of different rooms inside the house: the kitchen, the bathroom, and the bedroom. The coordinates (interactive latitude and longitude points) of four outdoor areas (mosque, laundry, supermarket, and home) were identified and stored in a microcomputer's memory to evaluate the outdoor setting. The results show that the root mean square error for the indoor setting after 45 trials is about 0.192. In addition, Dijkstra's algorithm determined the shortest distance between two places with an accuracy of 97%.
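To make the routing step concrete, here is a minimal Dijkstra implementation over a weighted adjacency dictionary of the four stored outdoor locations; the distances (in metres) are made up for illustration, and the sketch assumes the goal is reachable.

```python
import heapq

def dijkstra(graph, start, goal):
    """Shortest path between two stored places via Dijkstra's algorithm."""
    dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [goal], goal
    while node != start:                  # assumes goal is reachable
        node = prev[node]
        path.append(node)
    return dist[goal], path[::-1]

graph = {
    "home": {"mosque": 250, "laundry": 300},
    "mosque": {"home": 250, "supermarket": 300},
    "laundry": {"home": 300, "supermarket": 150},
    "supermarket": {"mosque": 300, "laundry": 150},
}
print(dijkstra(graph, "home", "supermarket"))
# -> (450.0, ['home', 'laundry', 'supermarket'])
```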
Affiliation(s)
- Mohsen Bakouri
- Department of Medical Equipment Technology, College of Applied Medical Science, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Department of Physics, College of Arts, Fezzan University, Traghen 71340, Libya
- Naif Alyami
- Department of Medical Equipment Technology, College of Applied Medical Science, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Ahmad Alassaf
- Department of Medical Equipment Technology, College of Applied Medical Science, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Mohamed Waly
- Department of Medical Equipment Technology, College of Applied Medical Science, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Tariq Alqahtani
- Department of Medical Equipment Technology, College of Applied Medical Science, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Ibrahim AlMohimeed
- Department of Medical Equipment Technology, College of Applied Medical Science, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Abdulrahman Alqahtani
- Department of Medical Equipment Technology, College of Applied Medical Science, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Department of Biomedical Technology, College of Applied Medical Sciences in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
- Md Samsuzzaman
- Department of Computer and Communication Engineering, Faculty of Computer Science and Engineering, Patuakhali Science and Technology University, Patuakhali 6800, Bangladesh
- Husham Farouk Ismail
- Department of Biomedical Equipment Technology, Inaya Medical College, Riyadh 13541, Saudi Arabia
- Yousef Alharbi
- Department of Biomedical Technology, College of Applied Medical Sciences in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
16. Li K, Qian Z, Han Y, Chang EIC, Wei B, Lai M, Liao J, Fan Y, Xu Y. Weakly supervised histopathology image segmentation with self-attention. Med Image Anal 2023; 86:102791. [PMID: 36933385] [DOI: 10.1016/j.media.2023.102791]
Abstract
Accurate pixel-level segmentation in histopathology images plays a critical role in the digital pathology workflow. The development of weakly supervised methods for histopathology image segmentation liberates pathologists from time-consuming and labor-intensive work, opening up possibilities for further automated quantitative analysis of whole-slide histopathology images. As an effective subgroup of weakly supervised methods, multiple instance learning (MIL) has achieved great success in histopathology images. In this paper, we treat pixels as instances, so that the histopathology image segmentation task is transformed into an instance prediction task in MIL. However, the lack of relations between instances in MIL limits further improvement of segmentation performance. Therefore, we propose a novel weakly supervised method called SA-MIL for pixel-level segmentation in histopathology images. SA-MIL introduces a self-attention mechanism into the MIL framework, which captures global correlation among all instances. In addition, we use deep supervision to make the best use of information from the limited annotations in the weakly supervised setting. Our approach makes up for the independence of instances in MIL by aggregating global contextual information. We demonstrate state-of-the-art results compared to other weakly supervised methods on two histopathology image datasets; the high performance on both tissue and cell datasets shows that our approach generalizes well and holds potential for various applications in medical imaging.
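The global correlation among instances comes from standard scaled dot-product self-attention. The sketch below shows the weight-free, single-head core of that operation over per-pixel instance embeddings; real implementations (including, presumably, SA-MIL's) add learned query/key/value projections, and the dimensions here are illustrative.

```python
import torch

def self_attention(x):
    """Scaled dot-product self-attention over instance embeddings x: [N, D].
    Each instance (pixel) aggregates information from all others, which is
    exactly the inter-instance relation that plain MIL lacks."""
    d = x.shape[-1]
    attn = torch.softmax(x @ x.T / d ** 0.5, dim=-1)  # [N, N] pairwise weights
    return attn @ x

x = torch.randn(1024, 64)       # 1024 pixel-instances, 64-D embeddings
print(self_attention(x).shape)  # torch.Size([1024, 64])
```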
Affiliation(s)
- Kailu Li
- School of Biological Science and Medical Engineering, State Key Laboratory of Software Development Environment, Key Laboratory of Biomechanics, Mechanobiology of Ministry of Education and Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing 100191, China
- Ziniu Qian
- School of Biological Science and Medical Engineering, State Key Laboratory of Software Development Environment, Key Laboratory of Biomechanics, Mechanobiology of Ministry of Education and Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing 100191, China
- Yingnan Han
- School of Biological Science and Medical Engineering, State Key Laboratory of Software Development Environment, Key Laboratory of Biomechanics, Mechanobiology of Ministry of Education and Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing 100191, China
- Maode Lai
- Department of Pathology, School of Medicine, Zhejiang University, Hangzhou 310027, China
- Jing Liao
- Department of Computer Science, City University of Hong Kong, 999077, Hong Kong SAR, China
- Yubo Fan
- School of Biological Science and Medical Engineering, State Key Laboratory of Software Development Environment, Key Laboratory of Biomechanics, Mechanobiology of Ministry of Education and Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing 100191, China
- Yan Xu
- School of Biological Science and Medical Engineering, State Key Laboratory of Software Development Environment, Key Laboratory of Biomechanics, Mechanobiology of Ministry of Education and Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing 100191, China; Microsoft Research, Beijing 100080, China
17. Zhao T, Han J, Yang L, Zhang D. Equivalent Classification Mapping for Weakly Supervised Temporal Action Localization. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:3019-3031. [PMID: 35635810] [DOI: 10.1109/tpami.2022.3178957]
Abstract
Weakly supervised temporal action localization is a newly emerging yet widely studied topic. Existing methods follow one of two localization-by-classification pipelines: the pre-classification pipeline and the post-classification pipeline. The pre-classification pipeline first performs classification on each video snippet and then aggregates the snippet-level classification scores to obtain the video-level classification score. In contrast, the post-classification pipeline aggregates the snippet-level features first and then predicts the video-level classification score from the aggregated feature. Although the classifiers in these two pipelines are used in different ways, the role they play is exactly the same: to classify the given features and identify the corresponding action categories. Hence, an ideal classifier can make both pipelines work. This inspires us to learn the two pipelines simultaneously in a unified framework to obtain an effective classifier. Specifically, in the proposed learning framework, we implement two parallel network streams to model the two localization-by-classification pipelines simultaneously and make the two streams share the same classifier, achieving the novel Equivalent Classification Mapping (ECM) mechanism. Moreover, we observe that an ideal classifier should possess two characteristics: 1) the frame-level classification scores obtained from the pre-classification stream and the feature aggregation weights in the post-classification stream should be consistent; and 2) the classification results of the two streams should be identical. Based on these two characteristics, we further introduce a weight-transition module and an equivalent training strategy into the proposed learning framework, which help to thoroughly exploit the equivalence mechanism. Comprehensive experiments are conducted on three benchmarks, and ECM achieves accurate action localization results.
18. Yang W, Chen M, Wu H, Lin Z, Kong D, Xie S, Takamasu K. Deep learning-based weak micro-defect detection on an optical lens surface with micro vision. Optics Express 2023; 31:5593-5608. [PMID: 36823835] [DOI: 10.1364/oe.482389]
Abstract
To overcome the limited efficiency and reliability of current manual quality control in optical lens (OL) production environments, we propose an automatic micro-vision-based inspection system, named MVIS, that captures surface defect images, builds the OL dataset, and performs predictive inference. OL defects are weak owing to their ambiguous morphology and micro size, and at low resolution they are hard to recognize, so existing methods detect them poorly. We therefore propose a deep-learning weak micro-defect detector named ISE-YOLO, which makes the best use of deep layers, utilizes an ISE attention module in the neck, and introduces a novel class loss function to extract richer semantics from the convolutional layers and learn more information. Experimental results on the OL dataset show that ISE-YOLO performs better than YOLOv5, with the mean average precision, recall, and F1 score increasing by 3.62%, 6.12%, and 3.07%, respectively. In addition, compared with YOLOv7, the latest version in the YOLO series, the mean average precision of ISE-YOLO is improved by 2.58%, while the weight size is decreased by more than 30% and the speed is increased by 16%.
19. Kamath V, Renuka A. Deep Learning Based Object Detection for Resource Constrained Devices: Systematic Review, Future Trends and Challenges Ahead. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.02.006]
20. Cardoen B, Wong T, Alan P, Lee S, Matsubara JA, Nabi IR, Hamarneh G. SPECHT: Self-tuning Plausibility based object detection Enables quantification of Conflict in Heterogeneous multi-scale microscopy. PLoS One 2022; 17:e0276726. [PMID: 36580473] [PMCID: PMC9799313] [DOI: 10.1371/journal.pone.0276726]
Abstract
Identification of small objects in fluorescence microscopy is a non-trivial task burdened by parameter-sensitive algorithms, for which there is a clear need for an approach that adapts dynamically to changing imaging conditions. Here, we introduce an adaptive object detection method that, given a microscopy image and an image-level label, uses kurtosis-based matching of the distribution of the image differential to express operator intent in terms of recall or precision. We show how a theoretical upper bound of the statistical distance in feature space enables the application of belief theory to obtain statistical support for each detected object, capturing those aspects of the image that support the label, and to what extent. We validate our method on two datasets: distinguishing sub-diffraction-limit caveolae and scaffold by stimulated emission depletion (STED) super-resolution microscopy; and detecting amyloid-β deposits in confocal microscopy retinal cross-sections of neuropathologically confirmed Alzheimer's disease donor tissue. Our results are consistent with biological ground truth and with previous subcellular object classification results, and add insight into more nuanced class-transition dynamics. We illustrate the novel application of belief theory to object detection in heterogeneous microscopy datasets and the quantification of conflict of evidence in a joint belief function. By applying our method successfully to diffraction-limited confocal imaging of tissue sections and super-resolution microscopy of subcellular structures, we demonstrate multi-scale applicability.
Affiliation(s)
- Ben Cardoen
- Medical Image Analysis Laboratory, School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
- * E-mail: (BC); (IRN); (GH)
- Timothy Wong
- Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, British Columbia, Canada
- Parsa Alan
- Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, British Columbia, Canada
- Sieun Lee
- Department of Ophthalmology and Visual Sciences, Eye Care Centre, University of British Columbia, Vancouver, British Columbia, Canada
- Mental Health & Clinical Neurosciences, School of Medicine, University of Nottingham, Nottingham, United Kingdom
- Joanne Aiko Matsubara
- Department of Ophthalmology and Visual Sciences, Eye Care Centre, University of British Columbia, Vancouver, British Columbia, Canada
- Ivan Robert Nabi
- Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, British Columbia, Canada
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Ghassan Hamarneh
- Medical Image Analysis Laboratory, School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
21. Yang L, Han J, Zhao T, Lin T, Zhang D, Chen J. Background-Click Supervision for Temporal Action Localization. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:9814-9829. [PMID: 34855585] [DOI: 10.1109/tpami.2021.3132058]
Abstract
Weakly supervised temporal action localization aims at learning the instance-level action pattern from the video-level labels, where a significant challenge is action-context confusion. To overcome this challenge, one recent work builds an action-click supervision framework. It requires similar annotation costs but can steadily improve the localization performance when compared to the conventional weakly supervised methods. In this paper, by revealing that the performance bottleneck of the existing approaches mainly comes from the background errors, we find that a stronger action localizer can be trained with labels on the background video frames rather than those on the action frames. To this end, we convert the action-click supervision to the background-click supervision and develop a novel method, called BackTAL. Specifically, BackTAL implements two-fold modeling on the background video frames, i.e., the position modeling and the feature modeling. In position modeling, we not only conduct supervised learning on the annotated video frames but also design a score separation module to enlarge the score differences between the potential action frames and backgrounds. In feature modeling, we propose an affinity module to measure frame-specific similarities among neighboring frames and dynamically attend to informative neighbors when calculating temporal convolution. Extensive experiments on three benchmarks are conducted, which demonstrate the high performance of the established BackTAL and the rationality of the proposed background-click supervision.
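As a hedged sketch of what a score-separation term could look like, the snippet below pushes the mean of the top-k class-activation scores (likely action frames) above the scores at annotated background clicks by a margin; k, the margin, and the hinge form are assumptions for illustration, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def score_separation_loss(cas, bg_idx, k=8, margin=1.0):
    """cas: [T] class-activation sequence for one class; bg_idx: indices of
    background-click frames. Enlarges the gap between potential action
    frames and annotated backgrounds."""
    topk_mean = cas.topk(k).values.mean()   # likely action frames
    bg_mean = cas[bg_idx].mean()            # clicked background frames
    return F.relu(margin - (topk_mean - bg_mean))

cas = torch.randn(100)
print(score_separation_loss(cas, torch.tensor([3, 40, 77])))
```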
22. Zong G, Wei L, Guo S, Wang Y. A cascaded refined RGB-D salient object detection network based on the attention mechanism. Appl Intell 2022. [DOI: 10.1007/s10489-022-04186-9]
23. Weakly Supervised Object Detection with Symmetry Context. Symmetry (Basel) 2022. [DOI: 10.3390/sym14091832]
Abstract
Recently, weakly supervised object detection (WSOD) with image-level annotation has attracted great attention in the field of computer vision. Existing studies often formulate the problem as multiple instance learning, which tends to be trapped by discriminative object parts and fails to localize object boundaries precisely. In this work, we alleviate this problem by exploiting contextual information that can potentially increase object localization accuracy. Specifically, we propose novel context proposal mining strategies and a Symmetry Context Module to leverage the surrounding contextual information of precomputed region proposals. Both naive and Gaussian-based context proposal mining methods are adopted to yield informative context proposals symmetrically surrounding the region proposals. The mined context proposals are then fed into our Symmetry Context Module to encourage the model to select proposals that contain the whole object rather than only its most discriminative parts. Experimental results show that the proposed method achieves a mean Average Precision (mAP) of 52.4% on the PASCAL VOC 2007 dataset, outperforming state-of-the-art methods and demonstrating its effectiveness for weakly supervised object detection.
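The naive mining strategy can be pictured as symmetrically enlarging each region proposal about its centre so that the ring between the two boxes carries the surrounding context. A geometric sketch follows; the enlargement factor is an assumed value, not the paper's setting.

```python
def context_proposal(box, scale=1.8):
    """Enlarge (x1, y1, x2, y2) symmetrically about its centre; the area
    between the original box and the returned one is the context ring."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = (x2 - x1) * scale, (y2 - y1) * scale
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

print(context_proposal((100, 100, 200, 180)))  # (60.0, 68.0, 240.0, 212.0)
```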
24. Li Y, Xue Y, Li L, Zhang X, Qian X. Domain Adaptive Box-Supervised Instance Segmentation Network for Mitosis Detection. IEEE Transactions on Medical Imaging 2022; 41:2469-2485. [PMID: 35389862] [DOI: 10.1109/tmi.2022.3165518]
Abstract
The number of mitotic cells present in histopathological slides is an important predictor of tumor proliferation in the diagnosis of breast cancer. However, current approaches can hardly perform precise pixel-level prediction for mitosis datasets with only weak labels (i.e., labels that provide only the centroid location of mitotic cells), and they take no account of the large domain gap across histopathological slides from different pathology laboratories. In this work, we propose a Domain adaptive Box-supervised Instance segmentation Network (DBIN) to address these issues. In DBIN, we propose a high-performance Box-supervised Instance-Aware (BIA) head built around three redesigned box-supervised mask loss terms. Furthermore, we add a Pseudo-Mask-supervised Semantic (PMS) head to enrich the characteristics extracted from the underlying feature maps. Besides, we align the pixel-level feature distributions between source and target domains with a Cross-Domain Adaptive Module (CDAM), so that a detector learned in one laboratory works well on unlabeled data from another. The proposed method achieves state-of-the-art performance across four mainstream datasets. A series of analyses and experiments show that our proposed BIA and PMS heads accomplish pixel-wise mitosis localization under weak supervision, and that CDAM boosts the generalization ability of our model.
25. Milani F, Pinciroli Vago NO, Fraternali P. Proposals Generation for Weakly Supervised Object Detection in Artwork Images. J Imaging 2022; 8:215. [PMID: 36005458] [PMCID: PMC9410216] [DOI: 10.3390/jimaging8080215]
Abstract
Object detection requires many precise annotations, which are available for natural images but not for many non-natural data sets, such as artwork data sets. One solution is to use Weakly Supervised Object Detection (WSOD) techniques that learn accurate object localization from image-level labels. Studies have demonstrated that state-of-the-art end-to-end architectures may not be suitable for domains in which images or classes differ substantially from those used to pre-train networks. This paper presents a novel two-stage Weakly Supervised Object Detection approach for obtaining accurate bounding boxes on non-natural data sets. The proposed method exploits existing classification knowledge to generate pseudo-ground-truth bounding boxes from Class Activation Maps (CAMs). The automatically generated annotations are used to train a robust Faster R-CNN object detector. Quantitative and qualitative analysis shows that bounding boxes generated from CAMs can compensate for the lack of manually annotated ground truth (GT) and that an object detector trained with such pseudo-GT surpasses end-to-end WSOD state-of-the-art methods on ArtDL 2.0 (≈41.5% mAP) and IconArt (≈17% mAP), two artwork data sets. The proposed solution is a step towards the computer-aided study of non-natural images and opens the way to more advanced tasks, e.g., automatic artwork image captioning for digital archive applications.
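The pseudo-ground-truth step can be reduced to thresholding a CAM and taking the tight box around the surviving region, as in the sketch below; the relative threshold value is an assumption, and real pipelines typically add per-class tuning and multi-blob handling.

```python
import numpy as np

def cam_to_box(cam, rel_thresh=0.4):
    """Threshold a class activation map at a fraction of its maximum and
    return the tight (x1, y1, x2, y2) box around the activated region."""
    ys, xs = np.where(cam >= rel_thresh * cam.max())
    return xs.min(), ys.min(), xs.max(), ys.max()

cam = np.zeros((14, 14))
cam[4:9, 5:11] = 1.0      # toy activation blob
print(cam_to_box(cam))    # (5, 4, 10, 8)
```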
Affiliation(s)
- Federico Milani
- Department of Electronics Information and Bioengineering, Politecnico di Milano, 20133 Milano, Italy
26. Object Localization in Weakly Labeled Remote Sensing Images Based on Deep Convolutional Features. Remote Sensing 2022. [DOI: 10.3390/rs14133230]
Abstract
Object recognition, as one of the most fundamental and challenging problems in high-resolution remote sensing image interpretation, has received increasing attention in recent years. However, most conventional object recognition pipelines aim to recognize instances with bounding boxes in a supervised learning strategy, which requires intensive manual labor for instance annotation. In this paper, we propose a weakly supervised learning method to alleviate this problem. The core idea of our method is to recognize multiple objects in an image using only image-level semantic labels and to indicate the recognized objects with location points instead of box extents. Specifically, a deep convolutional neural network is first trained to perform semantic scene classification, whose result is employed for the categorical determination of objects in an image. Then, by back-propagating the categorical feature from the fully connected layer to the deep convolutional layer, the categorical and spatial information of an image are combined to obtain an object discriminative localization map, which can effectively indicate the salient regions of objects. Next, a dynamic updating method of local response extremum is proposed to further determine the locations of objects in an image. Finally, extensive experiments are conducted to localize aircraft and oiltanks in remote sensing images based on different convolutional neural networks. Experimental results show that the proposed method outperforms state-of-the-art methods, achieving precision, recall, and F1-scores of 94.50%, 88.79%, and 91.56% for aircraft localization and 89.12%, 83.04%, and 85.97% for oiltank localization, respectively. We hope that our work can serve as a basic reference for remote sensing object localization via a weakly supervised strategy and provide new opportunities for further research.
27. Li H, Li Y, Jin Y, Wang T. Object representation enhancement for self-supervised colocalization. Int J Intell Syst 2022. [DOI: 10.1002/int.22938]
Affiliation(s)
- Huifang Li
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Yidong Li
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Yi Jin
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Tao Wang
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
28. Xu X, Sanford T, Turkbey B, Xu S, Wood BJ, Yan P. Shadow-Consistent Semi-Supervised Learning for Prostate Ultrasound Segmentation. IEEE Transactions on Medical Imaging 2022; 41:1331-1345. [PMID: 34971530] [PMCID: PMC9709821] [DOI: 10.1109/tmi.2021.3139999]
Abstract
Prostate segmentation in transrectal ultrasound (TRUS) images is an essential prerequisite for many prostate-related clinical procedures, which, however, is also a long-standing problem due to the challenges caused by low image quality and shadow artifacts. In this paper, we propose a Shadow-consistent Semi-supervised Learning (SCO-SSL) method with two novel mechanisms, namely shadow augmentation (Shadow-AUG) and shadow dropout (Shadow-DROP), to tackle this challenging problem. Specifically, Shadow-AUG enriches the training samples by adding simulated shadow artifacts to the images to make the network robust to shadow patterns. Shadow-DROP forces the segmentation network to infer the prostate boundary from the neighboring shadow-free pixels. Extensive experiments are conducted on two large clinical datasets (a public dataset containing 1,761 TRUS volumes and an in-house dataset containing 662 TRUS volumes). In the fully-supervised setting, a vanilla U-Net equipped with our Shadow-AUG&Shadow-DROP outperforms the state-of-the-art methods with statistical significance. In the semi-supervised setting, even with only 20% labeled training data, our SCO-SSL method still achieves highly competitive performance, suggesting great clinical value in relieving the labor of data annotation. Source code is released at https://github.com/DIAL-RPI/SCO-SSL.
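To illustrate the flavour of Shadow-AUG, the sketch below attenuates a vertical band of a 2-D ultrasound image with a smooth Gaussian profile to imitate an acoustic shadow; the band position, width, strength, and profile shape are illustrative assumptions, not the released implementation.

```python
import numpy as np

def shadow_aug(img, x0, width, strength=0.8):
    """Multiply each column of an (H, W) image by a Gaussian attenuation
    profile centred at column x0, simulating a shadow artifact."""
    cols = np.arange(img.shape[1])
    profile = 1.0 - strength * np.exp(-0.5 * ((cols - x0) / width) ** 2)
    return img * profile[None, :]

img = np.random.rand(256, 256).astype(np.float32)
aug = shadow_aug(img, x0=128, width=20)
print(aug.shape, float(aug[:, 128].mean()) < float(img[:, 128].mean()))
```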
Collapse
|
29
|
Adke S, Li C, Rasheed KM, Maier FW. Supervised and Weakly Supervised Deep Learning for Segmentation and Counting of Cotton Bolls Using Proximal Imagery. SENSORS 2022; 22:s22103688. [PMID: 35632096 PMCID: PMC9147286 DOI: 10.3390/s22103688] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/08/2022] [Revised: 04/28/2022] [Accepted: 05/05/2022] [Indexed: 11/16/2022]
Abstract
The total boll count of a plant is one of the most important phenotypic traits for cotton breeding and an important factor for growers in estimating the final yield. With recent advances in deep learning, many supervised learning approaches have been implemented to measure phenotypic traits from images for various crops, but few studies have counted cotton bolls from field images. Supervised learning models require a vast number of annotated images for training, which has become a bottleneck for machine learning model development. The goal of this study is to develop both fully supervised and weakly supervised deep learning models to segment and count cotton bolls from proximal imagery. A total of 290 RGB images of cotton plants from both potted (indoor and outdoor) and in-field settings were taken with consumer-grade cameras, and the raw images were divided into 4350 image tiles for model training and testing. Two supervised models (Mask R-CNN and S-Count) and two weakly supervised approaches (WS-Count and CountSeg) were compared in terms of boll count accuracy and annotation cost. The results revealed that the weakly supervised counting approaches performed well, with RMSE values of 1.826 and 1.284 for WS-Count and CountSeg, respectively, whereas the fully supervised models achieved RMSE values of 1.181 and 1.175 for S-Count and Mask R-CNN, respectively, when the number of bolls in an image patch was less than 10. In terms of data annotation cost, the weakly supervised approaches were at least 10 times more cost-efficient than the supervised approaches for boll counting. In the future, the deep learning models developed in this study can be extended to other plant organs, such as main stalks, nodes, and primary and secondary branches. Both the supervised and weakly supervised deep learning models for boll counting with low-cost RGB images can be used by cotton breeders, physiologists, and growers alike to improve crop breeding and yield estimation.
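For reference, the count-accuracy metric quoted above is the root-mean-square error between predicted and ground-truth boll counts per image patch; a minimal sketch follows, with made-up counts rather than the study's data.

```python
import numpy as np

def count_rmse(pred: np.ndarray, true: np.ndarray) -> float:
    """Root-mean-square error between predicted and true object counts."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))

# Toy example: three patches, each prediction off by one boll -> RMSE = 1.0
print(count_rmse(np.array([3, 7, 9]), np.array([4, 6, 10])))
```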
Collapse
Affiliation(s)
- Shrinidhi Adke
  - Institute of Artificial Intelligence, University of Georgia, Athens, GA 30602, USA
  - Bio-Sensing and Instrumentation Laboratory, College of Engineering, University of Georgia, Athens, GA 30602, USA
- Changying Li (corresponding author)
  - Institute of Artificial Intelligence, University of Georgia, Athens, GA 30602, USA
  - Bio-Sensing and Instrumentation Laboratory, College of Engineering, University of Georgia, Athens, GA 30602, USA
  - Phenomics and Plant Robotics Center, University of Georgia, Athens, GA 30602, USA
- Khaled M. Rasheed
  - Institute of Artificial Intelligence, University of Georgia, Athens, GA 30602, USA
  - Phenomics and Plant Robotics Center, University of Georgia, Athens, GA 30602, USA
- Frederick W. Maier
  - Institute of Artificial Intelligence, University of Georgia, Athens, GA 30602, USA
Collapse
|
30
|
RSMNet: A Regional Similar Module Network for Weakly Supervised Object Localization. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10849-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
31
|
Jafari MH, Luong C, Tsang M, Gu AN, Van Woudenberg N, Rohling R, Tsang T, Abolmaesumi P. U-LanD: Uncertainty-Driven Video Landmark Detection. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:793-804. [PMID: 34705639 DOI: 10.1109/tmi.2021.3123547] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
This paper presents U-LanD, a framework for automatically detecting landmarks on key frames of a video by leveraging the uncertainty of landmark prediction. We tackle a specifically challenging problem in which training labels are noisy and highly sparse. U-LanD builds upon a pivotal observation: a deep Bayesian landmark detector trained solely on key video frames has significantly lower predictive uncertainty on those frames than on other frames in the video. We use this observation as an unsupervised signal to automatically recognize the key frames on which we detect landmarks. As a test-bed for our framework, we use ultrasound imaging videos of the heart, where sparse and noisy clinical labels are available for only a single frame in each video. Using data from 4,493 patients, we demonstrate that U-LanD substantially outperforms the state-of-the-art non-Bayesian counterpart by a noticeable absolute margin of 42% in R2 score, with almost no overhead imposed on the model size.
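A minimal sketch of the core signal described above, assuming a landmark detector with dropout layers that maps a stack of frames (T, C, H, W) to per-frame predictions (T, K): predictive variance across Monte-Carlo dropout passes scores each frame, and the lowest-uncertainty frames are kept as key frames. The number of passes and the selection quantile are illustrative assumptions, not the paper's settings.

```python
import torch

@torch.no_grad()
def frame_uncertainty(model: torch.nn.Module, frames: torch.Tensor,
                      n_passes: int = 10) -> torch.Tensor:
    """frames: (T, C, H, W); returns one predictive-variance score per frame."""
    model.train()  # keep dropout active so each pass samples a different subnet
    preds = torch.stack([model(frames) for _ in range(n_passes)])  # (P, T, K)
    return preds.var(dim=0).mean(dim=-1)  # variance over passes, averaged over outputs

def select_key_frames(uncertainty: torch.Tensor, quantile: float = 0.2) -> torch.Tensor:
    """Keep the indices of the frames in the lowest-uncertainty quantile."""
    return torch.nonzero(uncertainty <= torch.quantile(uncertainty, quantile)).squeeze(-1)
```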
Collapse
|
32
|
Liu P, Zheng G. Handling Imbalanced Data: Uncertainty-guided Virtual Adversarial Training with Batch Nuclear-norm Optimization for Semi-supervised Medical Image Classification. IEEE J Biomed Health Inform 2022; 26:2983-2994. [PMID: 35344500 DOI: 10.1109/jbhi.2022.3162748] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
In many clinical settings, medical image datasets suffer from class imbalance, which biases the predictions of trained models toward the majority classes. Semi-supervised learning (SSL) algorithms trained on such imbalanced datasets are even more problematic, since the pseudo-supervision for unlabeled data is generated from the model's biased predictions. To address these issues, in this work we propose a novel semi-supervised deep learning method, i.e., uncertainty-guided virtual adversarial training (VAT) with batch nuclear-norm (BNN) optimization, for large-scale medical image classification. To effectively exploit useful information from both labeled and unlabeled data, we leverage VAT and BNN optimization to harness the underlying knowledge, which helps to improve the discriminability, diversity, and generalization of the trained models. More concretely, our network is trained by minimizing a combination of four losses: a supervised cross-entropy loss, a BNN loss defined on the output matrix of a labeled data batch (lBNN loss), a negative BNN loss defined on the output matrix of an unlabeled data batch (uBNN loss), and a VAT loss on both labeled and unlabeled data. We additionally use uncertainty estimation to filter out unlabeled samples near the decision boundary when computing the VAT loss. We conduct comprehensive experiments to evaluate the performance of our method on two publicly available datasets and one in-house collected dataset. The experimental results demonstrate that our method achieves better results than state-of-the-art SSL methods.
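A hedged sketch of the BNN terms described above, using torch.linalg.matrix_norm to compute the nuclear norm of the batch softmax matrix. The sign conventions follow the abstract (the labeled-batch nuclear norm enters positively, the unlabeled-batch nuclear norm negatively); the batch-size scaling and the loss weights lam_l, lam_u, lam_vat are illustrative assumptions, and the VAT term itself (with its uncertainty-based filtering) is assumed precomputed.

```python
import torch
import torch.nn.functional as F

def batch_nuclear_norm(logits: torch.Tensor) -> torch.Tensor:
    """Nuclear norm of the (batch, classes) softmax prediction matrix,
    scaled by batch size (the scaling is an assumption, not from the paper)."""
    probs = F.softmax(logits, dim=1)
    return torch.linalg.matrix_norm(probs, ord="nuc") / logits.shape[0]

def combined_loss(labeled_logits, labels, unlabeled_logits, vat_loss,
                  lam_l=1.0, lam_u=1.0, lam_vat=1.0):
    """CE + lBNN - uBNN + VAT, following the abstract's description."""
    ce = F.cross_entropy(labeled_logits, labels)
    l_bnn = batch_nuclear_norm(labeled_logits)     # lBNN loss on the labeled batch
    u_bnn = -batch_nuclear_norm(unlabeled_logits)  # negative uBNN loss on the unlabeled batch
    return ce + lam_l * l_bnn + lam_u * u_bnn + lam_vat * vat_loss
```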
Collapse
|
33
|
The Challenge of Data Annotation in Deep Learning—A Case Study on Whole Plant Corn Silage. SENSORS 2022; 22:s22041596. [PMID: 35214497 PMCID: PMC8879292 DOI: 10.3390/s22041596] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/20/2021] [Revised: 02/14/2022] [Accepted: 02/16/2022] [Indexed: 02/04/2023]
Abstract
Recent advances in computer vision are primarily driven by deep learning, which is known to require large amounts of data, and creating datasets for this purpose is not a trivial task. Larger benchmark datasets often have detailed annotation processes with multiple stages and users in different roles. However, this can be difficult to implement in smaller projects where resources are limited. Therefore, in this work we present our process for creating an image dataset for kernel fragmentation and stover overlengths in Whole Plant Corn Silage, including the guidelines for annotating object instances of the respective classes and statistics of the gathered annotations. Given the challenging image conditions, where objects appear under heavy occlusion and clutter, the datasets appear appropriate for training models. However, we experience annotator inconsistency, which can hamper evaluation. Based on this, we argue for the importance of an evaluation that is independent of the manual annotation, and we evaluate our models with physically based sieving metrics. Additionally, instead of the traditional time-consuming manual annotation approach, we evaluate semi-supervised learning as an alternative, showing competitive results while requiring fewer annotations. Specifically, given a relatively large supervised set of around 1400 images, we can improve the Average Precision by several percentage points. We also show a significantly larger improvement when using an extremely small set of just over 100 images, with over a 3× gain in Average Precision and up to 20 percentage points when estimating quality.
Collapse
|
34
|
Liu Y, Wei YS, Yan H, Li GB, Lin L. Causal Reasoning Meets Visual Representation Learning: A Prospective Study. MACHINE INTELLIGENCE RESEARCH 2022; 19:485-511. [PMCID: PMC9638478 DOI: 10.1007/s11633-022-1362-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/09/2022] [Accepted: 08/01/2022] [Indexed: 09/29/2023]
Abstract
Visual representation learning is ubiquitous in various real-world applications, including visual comprehension, video understanding, multi-modal analysis, human-computer interaction, and urban computing. With the emergence of huge amounts of multimodal heterogeneous spatial/temporal/spatial-temporal data in the big data era, the lack of interpretability, robustness, and out-of-distribution generalization has become a key challenge for existing visual models. Most existing methods fit the original data/variable distributions and ignore the essential causal relations behind multi-modal knowledge, providing no unified guidance or analysis of why modern visual representation learning methods easily collapse into data bias and exhibit limited generalization and cognitive abilities. Inspired by the strong inference ability of human-level agents, recent years have therefore witnessed great effort in developing causal reasoning paradigms that realize robust representation and model learning with good cognitive ability. In this paper, we conduct a comprehensive review of existing causal reasoning methods for visual representation learning, covering fundamental theories, models, and datasets. The limitations of current methods and datasets are also discussed. Moreover, we propose prospective challenges, opportunities, and future research directions for benchmarking causal reasoning algorithms in visual representation learning. This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and highlight the urgency of developing novel causal reasoning methods, publicly available benchmarks, and consensus-building standards for reliable visual representation learning and related real-world applications.
Collapse
Affiliation(s)
- Yang Liu
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Yu-Shen Wei
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Hong Yan
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Guan-Bin Li
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Liang Lin
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
Collapse
|
35
|
Pinciroli Vago NO, Milani F, Fraternali P, da Silva Torres R. Comparing CAM Algorithms for the Identification of Salient Image Features in Iconography Artwork Analysis. J Imaging 2021; 7:106. [PMID: 39080894 PMCID: PMC8321385 DOI: 10.3390/jimaging7070106] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2021] [Revised: 06/16/2021] [Accepted: 06/24/2021] [Indexed: 12/13/2022] Open
Abstract
Iconography studies the visual content of artworks by considering the themes portrayed in them and their representation. Computer vision has been used to identify iconographic subjects in paintings, and convolutional neural networks have enabled the effective classification of characters in Christian art paintings. However, it remains to be demonstrated whether the classification results obtained by CNNs rely on the same iconographic properties that human experts exploit when studying iconography, and whether the architecture of a classifier trained on whole artwork images can be exploited to support the much harder task of object detection. A suitable approach for exposing the classification process of neural models relies on Class Activation Maps (CAMs), which highlight the areas of an image that contribute most to the classification. This work compares state-of-the-art algorithms (CAM, Grad-CAM, Grad-CAM++, and Smooth Grad-CAM++) in terms of their capacity to identify the iconographic attributes that determine the classification of characters in Christian art paintings. Quantitative and qualitative analyses show that Grad-CAM, Grad-CAM++, and Smooth Grad-CAM++ perform similarly, while CAM is less effective. Smooth Grad-CAM++ isolates multiple disconnected image regions and identifies small iconographic symbols well. Grad-CAM produces wider and more contiguous areas that better cover large iconographic symbols. The salient image areas computed by the CAM algorithms were used to estimate object-level bounding boxes, and a quantitative analysis shows that the boxes estimated with Grad-CAM reach 55% average IoU, 61% GT-known localization, and 31% mAP. These results are a step towards the computer-aided study of the positioning and mutual relations of iconographic elements in artworks and open the way to the automatic creation of bounding boxes for training detectors of iconographic symbols in Christian art images.
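To illustrate the bounding-box estimation step, here is a minimal sketch that thresholds a normalized CAM, takes the tightest box around the surviving pixels, and scores it against a ground-truth box with IoU. The 0.2 threshold is an illustrative assumption, not the paper's setting.

```python
import numpy as np

def cam_to_bbox(cam: np.ndarray, thresh: float = 0.2):
    """cam: 2D map normalised to [0, 1]; returns (x1, y1, x2, y2) or None."""
    ys, xs = np.where(cam >= thresh)
    if ys.size == 0:
        return None
    return (xs.min(), ys.min(), xs.max(), ys.max())

def iou(a, b) -> float:
    """Intersection over union for pixel-inclusive (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1 + 1) * max(0, iy2 - iy1 + 1)
    area = lambda r: (r[2] - r[0] + 1) * (r[3] - r[1] + 1)
    return inter / (area(a) + area(b) - inter)
```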
Collapse
Affiliation(s)
- Nicolò Oreste Pinciroli Vago
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milano, Italy
  - Department of ICT and Engineering, NTNU—Norwegian University of Science and Technology, 6009 Ålesund, Norway
- Federico Milani
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milano, Italy
- Piero Fraternali
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milano, Italy
- Ricardo da Silva Torres
  - Department of ICT and Engineering, NTNU—Norwegian University of Science and Technology, 6009 Ålesund, Norway
Collapse
|