1
Hong Q, Liu W, Zhu Y, Ren T, Shi C, Lu Z, Yang Y, Deng R, Qian J, Tan C. CTHNet: a network for wheat ear counting with local-global features fusion based on hybrid architecture. FRONTIERS IN PLANT SCIENCE 2024; 15:1425131. [PMID: 39015290 PMCID: PMC11250278 DOI: 10.3389/fpls.2024.1425131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 06/18/2024] [Indexed: 07/18/2024]
Abstract
Accurate wheat ear counting provides one of the key indicators for wheat phenotyping. Convolutional neural network (CNN) algorithms for counting wheat have evolved into sophisticated tools; however, because of their limited receptive fields, CNNs are unable to model global context information, which affects counting performance. In this study, we present a hybrid attention network (CTHNet) for wheat ear counting from RGB images that combines local features and global context information. On the one hand, to extract multi-scale local features, a convolutional neural network is built using the Cross Stage Partial framework. On the other hand, to acquire better global context information, tokenized image patches from convolutional neural network feature maps are encoded as input sequences using a Pyramid Pooling Transformer. Then, the feature fusion module merges the local features with the global context information to significantly enhance the feature representation. The Global Wheat Head Detection Dataset and the Wheat Ear Detection Dataset are used to assess the proposed model. The mean absolute errors were 3.40 and 5.21, respectively. The performance of the proposed model was significantly better than that of previous studies.
Affiliation(s)
- Qingqing Hong
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Wei Liu
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Yue Zhu
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Tianyu Ren
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Changrong Shi
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Zhixin Lu
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Yunqin Yang
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Ruiting Deng
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Jing Qian
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
- Changwei Tan
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Key Laboratory of Crop Cultivation and Physiology, Agricultural College of Yangzhou University, Yangzhou, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China/Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineer, Yangzhou University, Yangzhou, China
2
Wang L, Zhou H, Xu N, Liu Y, Jiang X, Li S, Feng C, Xu H, Deng K, Song J. A general approach for automatic segmentation of pneumonia, pulmonary nodule, and tuberculosis in CT images. iScience 2023; 26:107005. [PMID: 37534183 PMCID: PMC10391673 DOI: 10.1016/j.isci.2023.107005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Revised: 04/27/2023] [Accepted: 05/26/2023] [Indexed: 08/04/2023] Open
Abstract
Proposing a general segmentation approach for lung lesions, including pulmonary nodules, pneumonia, and tuberculosis, in CT images will improve efficiency in radiology. However, the performance of generative adversarial networks is hampered by the limited availability of annotated samples and the catastrophic forgetting of the discriminator, whereas the universality of traditional morphology-based methods is insufficient for segmenting diverse lung lesions. A cascaded dual-attention network with a context-aware pyramid feature extraction module was designed to address these challenges. A self-supervised rotation loss was designed to mitigate discriminator forgetting. The proposed model achieved Dice coefficients of 70.92, 73.55, and 68.52% on multi-center pneumonia, lung nodule, and tuberculosis test datasets, respectively. No significant decrease in accuracy was observed (p > 0.10) when a small training sample size was used. The cyclic training of the discriminator was reduced with self-supervised rotation loss (p < 0.01). The proposed approach is promising for segmenting multiple lung lesion types in CT images.
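For readers unfamiliar with the metric reported in this abstract, the Dice coefficient of a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal pure-Python sketch, not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as nested lists."""
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x3 masks: 3 foreground pixels each, 2 of them shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3)
```

Multiplying the score by 100 gives the percentage form used in the abstract.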
Affiliation(s)
- Lu Wang
- Department of Library, Shengjing Hospital of China Medical University, Shenyang, Liaoning 110004, China
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
- He Zhou
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
- Nan Xu
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
- Yuchan Liu
- Department of Radiology, The First Affiliated Hospital of University of Science and Technology of China (USTC), Division of Life Sciences and Medicine, USTC Hefei, Anhui 230036, China
- Xiran Jiang
- School of Intelligent Medicine, China Medical University, Shenyang, Liaoning 110122, China
- Shu Li
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
- Chaolu Feng
- Key Laboratory of Intelligent Computing in Medical Image (MIIC), Ministry of Education, Shenyang, Liaoning 110169, China
- Hainan Xu
- Department of Obstetrics and Gynecology, Pelvic Floor Disease Diagnosis and Treatment Center, Shengjing Hospital of China Medical University, Shenyang, Liaoning 110004, China
- Kexue Deng
- Department of Radiology, The First Affiliated Hospital of University of Science and Technology of China (USTC), Division of Life Sciences and Medicine, USTC Hefei, Anhui 230036, China
- Jiangdian Song
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
3
Zhai W, Gao M, Li Q, Jeon G, Anisetti M. FPANet: feature pyramid attention network for crowd counting. APPL INTELL 2023. [DOI: 10.1007/s10489-023-04499-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
4
Zhao Z, Li X. Deformable Density Estimation via Adaptive Representation. IEEE TRANSACTIONS ON IMAGE PROCESSING: A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; PP:1134-1144. [PMID: 37022433 DOI: 10.1109/tip.2023.3240839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Crowd counting is the basic task of crowd analysis and is of great significance in the field of public safety; it has therefore received increasing attention recently. The common approach is to train convolutional neural networks to predict a density map, which is generated by filtering the dot labels with specific Gaussian kernels. Although newly proposed networks have improved counting performance, they share one common problem: because of the perspective effect, targets at different positions within one scene differ significantly in scale, but existing density maps cannot represent this scale change well. To address the prediction difficulties caused by target scale variation, we propose a scale-sensitive crowd density map estimation framework, which deals with target scale change in the density map generation, network design, and model training stages. It consists of the Adaptive Density Map (ADM), the Deformable Density Map Decoder (DDMD), and the Auxiliary Branch. Specifically, the Gaussian kernel size varies adaptively with target size to generate an ADM that contains scale information for each specific target. DDMD introduces deformable convolution to fit the Gaussian kernel variation and boosts the model's scale sensitivity. The Auxiliary Branch guides the learning of deformable convolution offsets during the training phase. Finally, we conduct experiments on different large-scale datasets. The results show the effectiveness of the proposed ADM and DDMD. Furthermore, the visualization demonstrates that deformable convolution learns the target scale variation.
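The density-map construction this abstract describes (dot labels filtered with Gaussian kernels whose size follows the target size) can be illustrated with a short pure-Python sketch. This is not the paper's ADM procedure: the function name `gaussian_density_map`, the per-target `sigmas` argument, and the choice to normalize each kernel over the image are assumptions for illustration. Points are assumed to lie inside the image.

```python
import math


def gaussian_density_map(points, height, width, sigmas):
    """Place one unit-mass 2-D Gaussian per annotated point (x, y).

    sigmas holds one standard deviation per point, so a larger target
    can be given a wider kernel (hypothetical stand-in for adaptive sizing).
    """
    density = [[0.0] * width for _ in range(height)]
    for (px, py), sigma in zip(points, sigmas):
        kernel = [
            [math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        # Normalize so each target contributes exactly 1 to the map's sum.
        mass = sum(map(sum, kernel))
        for y in range(height):
            for x in range(width):
                density[y][x] += kernel[y][x] / mass
    return density


# Two hypothetical heads: a small distant one and a large near one.
points = [(5, 5), (20, 12)]
sigmas = [1.5, 4.0]
density = gaussian_density_map(points, height=24, width=32, sigmas=sigmas)
estimated_count = sum(map(sum, density))  # approximately 2.0
```

Because every kernel integrates to one, summing the map recovers the number of annotated targets, which is what lets a network trained to regress such maps produce a count.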
5
Lyu L, Han R, Chen Z. Cascaded parallel crowd counting network with multi-resolution collaborative representation. APPL INTELL 2023; 53:3002-3016. [PMID: 35607431 PMCID: PMC9117858 DOI: 10.1007/s10489-022-03639-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/14/2022] [Indexed: 01/14/2023]
Abstract
Accurately estimating the size and density distribution of a crowd from images is of great importance to public safety and crowd management during the COVID-19 pandemic, but it is very challenging as it is affected by many complex factors, including perspective distortion and background noise information. In this paper, we propose a novel multi-resolution collaborative representation framework called the cascaded parallel network (CP-Net), consisting of three parallel scale-specific branches connected in a cascading mode. In the framework, the three cascaded multi-resolution branches efficiently capture multi-scale features through their specific receptive fields. Additionally, multi-level feature fusion and information filtering are performed continuously on each branch to resist noise interference and perspective distortion. Moreover, we design an information exchange module across independent branches to refine the features extracted by each specific branch and deal with perspective distortion by using complementary information of multiple resolutions. To further improve the robustness of the network to scale variance and generate high-quality density maps, we construct a multi-receptive field fusion module to aggregate multi-scale features more comprehensively. The performance of our proposed CP-Net is verified on the challenging counting datasets (UCF_CC_50, UCF-QNRF, Shanghai Tech A&B, and WorldExpo'10), and the experimental results demonstrate the superiority of the proposed method.
Affiliation(s)
- Lei Lyu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358 China; Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan, 250358 China
- Run Han
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358 China; Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan, 250358 China
- Ziming Chen
- Shandong Zhengzhong Information Technology Co., LTD, Jinan, 250014 China; Shandong Digital Applied Science Research Institute Co., LTD, Jinan, 250101 China
6
Peng S, Yin B, Yang Q, He Q, Wang L. Exploring density rectification and domain adaption method for crowd counting. Neural Comput Appl 2023; 35:3551-3569. [PMID: 36267471 PMCID: PMC9568950 DOI: 10.1007/s00521-022-07917-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Accepted: 09/30/2022] [Indexed: 01/31/2023]
Abstract
Crowd counting has received increasing attention due to its important roles in multiple fields, such as social security, commercial applications, and epidemic prevention and control. We explore two critical issues that seriously affect the performance of crowd counting: nonuniform crowd density distribution and cross-domain problems. To address the nonuniform crowd density distribution issue, we propose a density rectifying network (DRNet) that consists of several dual-layer pyramid fusion modules (DPFM) and a density rectification map (DRmap) auxiliary learning module. The proposed DPFM is embedded into DRNet to integrate multi-scale crowd density features through dual-layer pyramid fusion. The devised DRmap auxiliary learning module further rectifies incorrect crowd density estimation by adaptively weighting the initial crowd density maps. With respect to the cross-domain issue, we develop a domain adaptation method of randomly cutting mixed dual-domain images, which learns domain-invariant features and decreases the domain gap between the source domain and the target domain from global and local perspectives. Experimental results indicate that the devised DRNet achieves the best mean absolute error (MAE) and competitive mean squared error (MSE) compared with other excellent methods on four benchmark datasets. Additionally, a series of cross-domain experiments are conducted to demonstrate the effectiveness of the proposed domain adaption method. Notably, when the A and B parts of the Shanghaitech dataset are the source domain and target domain respectively, the proposed domain adaption method decreases the MAE of DRNet by 47.6%.
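The two error measures named in this abstract can be computed from per-image counts as sketched below. This is an illustrative pure-Python sketch with hypothetical counts, not the authors' evaluation script. One caveat, stated as an assumption: many crowd-counting papers report the square root of the mean squared error under the name "MSE"; the abstract does not say which form is used here, so the sketch returns the root form and labels it `rmse`.

```python
import math


def counting_errors(predicted, actual):
    """Return (MAE, RMSE) over per-image predicted and ground-truth counts."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return mae, rmse


# Hypothetical counts for three test images (errors: +5, -2, +10).
mae, rmse = counting_errors([105.0, 48.0, 210.0], [100.0, 50.0, 200.0])
```

A relative reduction such as the 47.6% quoted in the abstract would then be `(mae_before - mae_after) / mae_before`.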
Affiliation(s)
- Sifan Peng
- Department of Automation, University of Science and Technology of China, Huangshan Road, Hefei, 230027 Anhui China
- Baoqun Yin
- Department of Automation, University of Science and Technology of China, Huangshan Road, Hefei, 230027 Anhui China
- Qianqian Yang
- Department of Automation, University of Science and Technology of China, Huangshan Road, Hefei, 230027 Anhui China
- Qing He
- Department of Automation, University of Science and Technology of China, Huangshan Road, Hefei, 230027 Anhui China
- Luyang Wang
- Department of Automation, University of Science and Technology of China, Huangshan Road, Hefei, 230027 Anhui China
7
Abozeid A, I. Taloba A, M. Abd El-Aziz R, Faiz Alwaghid A, Salem M, Elhadad A. An Efficient Indoor Localization Based on Deep Attention Learning Model. COMPUTER SYSTEMS SCIENCE AND ENGINEERING 2023; 46:2637-2650. [DOI: 10.32604/csse.2023.037761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 01/06/2023] [Indexed: 09/02/2023]
8
Huang K, Zhang Y, Cheng HD, Xing P. Trustworthy Breast Ultrasound Image Semantic Segmentation Based on Fuzzy Uncertainty Reduction. Healthcare (Basel) 2022; 10:2480. [PMID: 36554005 PMCID: PMC9778351 DOI: 10.3390/healthcare10122480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 12/02/2022] [Accepted: 12/06/2022] [Indexed: 12/14/2022] Open
Abstract
Medical image semantic segmentation is essential in computer-aided diagnosis systems. It can separate tissues and lesions in the image and provide valuable information to radiologists and doctors. Breast ultrasound (BUS) images have several advantages, including no radiation, low cost, and portability. However, there are two unfavorable characteristics: (1) the dataset size is often small due to the difficulty in obtaining the ground truths, and (2) BUS images are usually of poor quality. Trustworthy BUS image segmentation is urgently needed in breast cancer computer-aided diagnosis systems, especially for fully understanding the BUS images and segmenting the breast anatomy, which supports breast cancer risk assessment. The main challenge for this task is uncertainty in both pixels and channels of the BUS images. In this paper, we propose a Spatial and Channel-wise Fuzzy Uncertainty Reduction Network (SCFURNet) for BUS image semantic segmentation. The proposed architecture can reduce the uncertainty in the original segmentation frameworks. We apply the proposed method to four datasets: (1) a five-category BUS image dataset with 325 images, and (2) three BUS image datasets with only the tumor category (1830 images in total). The proposed approach is compared with state-of-the-art methods such as U-Net with VGG-16, ResNet-50/ResNet-101, Deeplab, FCN-8s, PSPNet, U-Net with information extension, attention U-Net, and U-Net with the self-attention mechanism. Compared with the original U-shape network with ResNet-101, it achieves 2.03%, 1.84%, and 2.88% improvements in the Jaccard index on three public BUS datasets, and a 6.72% improvement in the tumor category and a 4.32% improvement in overall performance on the five-category dataset, since it can handle the uncertainty effectively and efficiently.
Affiliation(s)
- Kuan Huang
- Department of Computer Science and Technology, Kean University, Union, NJ 07083, USA
- Yingtao Zhang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Heng-Da Cheng
- Department of Computer Science, Utah State University, Logan, UT 84322, USA
- Correspondence:
- Ping Xing
- Ultrasound Department, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
9
Zheng Z, Ni N, Xie G, Zhu A, Wu Y, Yang T. HARNet: Hierarchical adaptive regression with location recovery for crowd counting. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
10
Bai H, Mao J, Gary Chan SH. A survey on deep learning-based single image crowd counting: Network design, loss function and supervisory signal. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.08.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
11
Guo X, Gao M, Zhai W, Shang J, Li Q. Spatial-Frequency Attention Network for Crowd Counting. BIG DATA 2022; 10:453-465. [PMID: 35679590 DOI: 10.1089/big.2022.0039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Counting the number of people in crowded scenarios is a crucial task in video surveillance and urban security systems. Widely deployed surveillance cameras provide big data for training a compelling deep learning-based counting network. However, the problem of large scale variations in dense crowds is still not entirely solved. To address this problem, we propose a spatial-frequency attention network (SFANet) for crowd counting in this article. A bottleneck spatial attention module is built to emphasize features in various spatial locations and adaptively select regions containing individuals in the spatial domain. As a complement, in the frequency domain, a multispectral channel attention module is adopted to obtain a more complete set of frequency components for representing each channel. The two attention modules are combined to focus on the discriminative region and suppress misleading information through their mutual promotion. Experimental results on five benchmark crowd data sets demonstrate that the SFANet can achieve state-of-the-art performance in terms of accuracy and robustness.
Affiliation(s)
- Xiangyu Guo
- School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo, China
- Mingliang Gao
- School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo, China
- Wenzhe Zhai
- School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo, China
- Jianrun Shang
- School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo, China
- Qilei Li
- School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom
12
Meng C, Kang C, Lyu L. Hierarchical feature aggregation network with semantic attention for counting large‐scale crowd. INT J INTELL SYST 2022. [DOI: 10.1002/int.23023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- Chen Meng
- School of Information Science and Engineering Shandong Normal University Jinan Shandong China
- Chunmeng Kang
- School of Information Science and Engineering Shandong Normal University Jinan Shandong China
- Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology Jinan Shandong China
- Lei Lyu
- School of Information Science and Engineering Shandong Normal University Jinan Shandong China
- Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology Jinan Shandong China
13
Liu Y, Wang Z, Shi M, Satoh S, Zhao Q, Yang H. Discovering regression-detection bi-knowledge transfer for unsupervised cross-domain crowd counting. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
14
Liang L, Zhao H, Zhou F, Zhang Q, Song Z, Shi Q. SC2Net: Scale-aware Crowd Counting Network with Pyramid Dilated Convolution. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03648-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/02/2022]
15
Qaiser T, Lee CY, Vandenberghe M, Yeh J, Gavrielides MA, Hipp J, Scott M, Reischl J. Usability of deep learning and H&E images predict disease outcome-emerging tool to optimize clinical trials. NPJ Precis Oncol 2022; 6:37. [PMID: 35705792 PMCID: PMC9200764 DOI: 10.1038/s41698-022-00275-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/11/2021] [Accepted: 04/27/2022] [Indexed: 11/24/2022] Open
Abstract
Understanding factors that impact prognosis for cancer patients has high clinical relevance for treatment decisions and monitoring of the disease outcome. Advances in artificial intelligence (AI) and digital pathology offer an exciting opportunity to capitalize on the use of whole slide images (WSIs) of hematoxylin and eosin (H&E) stained tumor tissue for objective prognosis and prediction of response to targeted therapies. AI models often require hand-delineated annotations for effective training, which may not be readily available for larger data sets. In this study, we investigated whether AI models can be trained without region-level annotations and solely on patient-level survival data. We present a weakly supervised survival convolutional neural network (WSS-CNN) approach equipped with a visual attention mechanism for predicting overall survival. The inclusion of visual attention provides insights into regions of the tumor microenvironment with pathological interpretation, which may improve our understanding of the disease pathomechanism. We performed this analysis on two independent, multi-center patient data sets of lung (publicly available data) and bladder urothelial carcinoma. We perform univariable and multivariable analysis and show that WSS-CNN features are prognostic of overall survival in both tumor indications. The presented results highlight the significance of computational pathology algorithms for predicting prognosis using H&E stained images alone and underpin the use of computational methods to improve the efficiency of clinical trial studies.
Affiliation(s)
- Talha Qaiser
- Precision Medicine and Biosamples, Oncology R&D, AstraZeneca, Cambridge, UK.
- Joe Yeh
- AetherAI, Taipei City, Taiwan
- Jason Hipp
- Early Oncology, Oncology R&D, AstraZeneca, Cambridge, UK
- Marietta Scott
- Precision Medicine and Biosamples, Oncology R&D, AstraZeneca, Cambridge, UK
- Joachim Reischl
- Precision Medicine and Biosamples, Oncology R&D, AstraZeneca, Cambridge, UK
16
Wang Q, Lin W, Gao J, Li X. Density-Aware Curriculum Learning for Crowd Counting. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:4675-4687. [PMID: 33259316 DOI: 10.1109/tcyb.2020.3033428] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
Recently, crowd counting has drawn much attention on account of its significance in congestion control, public safety, and ecological surveys. Although performance has improved dramatically with the development of deep learning, the networks have also become larger and more complex. Moreover, a large model takes more time to train to reach better performance. To tackle these problems, this article first constructs a lightweight model, which is composed of an image feature encoder and a simple but effective decoder, called the pixel shuffle decoder (PSD). PSD ends with a pixel shuffle operator, which can display more density information without increasing the number of convolutional layers. Second, a density-aware curriculum learning (DCL) training strategy is designed to fully tap the potential of crowd counting models. DCL gives each predicted pixel a weight to determine its predicting difficulty and provides guidance on obtaining better generalization. Experimental results show that PSD achieves outstanding performance on most mainstream datasets when trained under the DCL framework. In addition, we conduct experiments adopting DCL on existing typical crowd counters, and the results show that they all perform better than before, which further validates the effectiveness of our method.
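The pixel shuffle operator mentioned in this abstract trades channels for spatial resolution: a tensor of shape (C·r², H, W) is rearranged into (C, H·r, W·r) without any learned weights. The pure-Python sketch below follows the common sub-pixel layout, in which output pixel (c, h·r + i, w·r + j) is read from input channel c·r² + i·r + j. Whether PSD uses exactly this layout is an assumption, and the function name and nested-list format are illustrative only.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list."""
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output block
            for j in range(r):      # column offset inside each r x r output block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 channels become one 2x2 channel when r = 2.
features = [[[0]], [[1]], [[2]], [[3]]]
upsampled = pixel_shuffle(features, 2)  # [[[0, 1], [2, 3]]]
```

Since the rearrangement only moves values, the sum of a predicted density map (and hence the count) is unchanged by the operator.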
17
Khaki S, Safaei N, Pham H, Wang L. WheatNet: A lightweight convolutional neural network for high-throughput image-based wheat head detection and counting. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.03.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/30/2022]
18
Wang F, Sang J, Wu Z, Liu Q, Sang N. Hybrid attention network based on progressive embedding scale-context for crowd counting. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.01.046] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/05/2022]
19
Chen Y, Yang J, Zhang D, Zhang K, Chen B, Du S. Region-aware network: Model human’s Top-Down visual perception mechanism for crowd counting. Neural Netw 2022; 148:219-231. [DOI: 10.1016/j.neunet.2022.01.015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/22/2021] [Revised: 01/19/2022] [Accepted: 01/20/2022] [Indexed: 11/16/2022]
|
20
|
Scale and Background Aware Asymmetric Bilateral Network for Unconstrained Image Crowd Counting. MATHEMATICS 2022. [DOI: 10.3390/math10071053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
This paper attacks the two challenging problems of image-based crowd counting, that is, scale variation and complex background. To that end, we present a novel crowd counting method, called the Scale and Background aware Asymmetric Bilateral Network (SBAB-Net), which is able to handle scale variation and background noise in a unified framework. Specifically, the proposed SBAB-Net contains three main components, a pre-trained backbone convolutional neural network (CNN) as the feature extractor and two asymmetric branches to generate a density map. These two asymmetric branches have different structures and use features from different semantic layers. One branch is densely connected stacked dilated convolution (DCSDC) sub-network with different dilation rates, which relies on one deep feature layer and can handle scale variation. The other branch is parameter-free densely connected stacked pooling (DCSP) sub-network with various pooling kernels and strides, which relies on shallow feature and can fuse features with several receptive fields to reduce the impact of background noise. Two sub-networks are fused by the attention mechanism to generate the final density map. Extensive experimental results on three widely-used benchmark datasets have demonstrated the effectiveness and superiority of our proposed method: (1) We achieve competitive counting performance compared to state-of-the-art methods; (2) Compared with baseline, the MAE and MSE are decreased by at least 6.3% and 11.3%, respectively.
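To make concrete why stacking dilated convolutions with different dilation rates helps with scale variation, the sketch below computes the receptive field of a plain stride-1 stack using the standard relation RF = 1 + Σ (k − 1)·d. The kernel size and dilation rates are hypothetical, since the abstract does not list the values used in DCSDC, and the dense connections are ignored: the result describes only the deepest path through the stack.

```python
def stacked_receptive_field(kernel_size, dilations):
    """Receptive field (in pixels, one axis) of stride-1 dilated convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    rf = 1
    for d in dilations:
        if d < 1:
            raise ValueError("dilation rates must be >= 1")
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical rates, chosen only for illustration.
plain = stacked_receptive_field(3, (1, 1, 1))    # three ordinary 3x3 layers
dilated = stacked_receptive_field(3, (1, 2, 3))  # same depth, growing dilation
```

With the same number of layers and parameters, the dilated stack covers a 13-pixel window instead of 7, which is the usual argument for using dilation to cover larger heads without extra pooling.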
|
21
|
Zhou F, Zhao H, Zhang Y, Zhang Q, Liang L, Li Y, Duan Z. COMAL: compositional multi-scale feature enhanced learning for crowd counting. MULTIMEDIA TOOLS AND APPLICATIONS 2022; 81:20541-20560. [PMID: 35291715 PMCID: PMC8914450 DOI: 10.1007/s11042-022-12249-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/23/2021] [Revised: 06/11/2021] [Accepted: 01/14/2022] [Indexed: 06/14/2023]
Abstract
Accurately modeling the crowd's head scale variations is an effective way to improve the counting accuracy of crowd counting methods. Most counting networks apply a multi-branch network structure to obtain head features at different scales. Although they have achieved promising results, they do not perform well on scenes with extreme scale variation because of their limited scale representability. Meanwhile, these methods are prone to recognizing background objects as foreground crowds in complex scenes because of limited context and high-level semantic information. We propose a compositional multi-scale feature enhanced learning approach (COMAL) for crowd counting to handle the above limitations. COMAL enhances the multi-scale feature representations in three ways: (1) The semantic enhanced module (SEM) is developed for embedding high-level semantic information into the multi-scale features; (2) The diversity enhanced module (DEM) is proposed to enrich the variety of scales in the crowd features; (3) The context enhanced module (CEM) is designed for strengthening the multi-scale features with more context information. Based on the proposed COMAL, we develop a crowd counting network under the encoder-decoder framework and perform extensive experiments on the ShanghaiTech, UCF_CC_50, and UCF-QNRF datasets. Qualitative and quantitative results demonstrate the effectiveness of the proposed COMAL.
Affiliation(s)
- Fangbo Zhou
- School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Huailin Zhao
- School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Yani Zhang
- School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai, China
- Qing Zhang
- School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai, China
- Lanjun Liang
- School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Yaoyao Li
- School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Zuodong Duan
- Science and Technology on Electromechanical Dynamic Control Laboratory, School of Mechatronical Engineering, Beijing Institute of Technology, Beijing, China
|
22
|
Towards real-time object detection in GigaPixel-level video. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.12.049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
23
|
Xie J, Gu L, Li Z, Lyu L. HRANet: Hierarchical region-aware network for crowd counting. APPL INTELL 2022; 52:12191-12205. [PMID: 35125656 PMCID: PMC8807383 DOI: 10.1007/s10489-021-03030-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/23/2021] [Indexed: 11/18/2022]
Abstract
Aiming to tackle the most intractable problems of scale variation and complex backgrounds in crowd counting, we present an innovative framework called Hierarchical Region-Aware Network (HRANet) for crowd counting in this paper, which can better focus on crowd regions to accurately predict crowd density. In our implementation, first, we design a Region-Aware Module (RAM) to capture the internal differences within different regions of the feature map, thus adaptively extracting contextual features within different regions. Furthermore, we propose a Region Recalibration Module (RRM) which adopts a novel region-aware attention mechanism (RAAM) to further recalibrate the feature weights of different regions. By the integration of the above two modules, the influence of background regions can be effectively suppressed. Besides, considering the local correlations within different regions of the crowd density map, a Region Awareness Loss (RAL) is designed to reduce false identification while producing the locally consistent density map. Extensive experiments on five challenging datasets demonstrate that the proposed method significantly outperforms existing methods in terms of counting accuracy and quality of the generated density map. In addition, a series of specific experiments in crowd gathering scenes indicate that our method can be practically applied to crowd localization.
|
24
|
He Y, Xia Y, Wang Y, Yin B. Jointly Attention Network for Crowd Counting. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.02.060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
25
|
Liu Q, Guo Y, Sang J, Tan J, Wang F, Tian S. SGCNet: Scale-aware and global contextual network for crowd counting. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03230-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
|
26
|
|
27
|
Xia Y, He Y, Peng S, Hao X, Yang Q, Yin B. EDENet: Elaborate density estimation network for crowd counting. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.06.086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
28
|
Gao J, Yuan Y, Wang Q. Feature-Aware Adaptation and Density Alignment for Crowd Counting in Video Surveillance. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:4822-4833. [PMID: 33259318 DOI: 10.1109/tcyb.2020.3034316] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
With the development of deep neural networks, the performance of crowd counting and pixel-wise density estimation is continually being refreshed. Despite this, there are still two challenging problems in this field: 1) current supervised learning needs a large amount of training data, but collecting and annotating them is difficult and 2) existing methods cannot generalize well to the unseen domain. A recently released synthetic crowd dataset alleviates these two problems. However, the domain gap between the real-world data and synthetic images decreases the models' performance. To reduce the gap, in this article, we propose a domain-adaptation-style crowd counting method, which can effectively adapt the model from synthetic data to the specific real-world scenes. It consists of multilevel feature-aware adaptation (MFA) and structured density map alignment (SDA). To be specific, MFA boosts the model to extract domain-invariant features from multiple layers. SDA guarantees the network outputs fine density maps with a reasonable distribution on the real domain. Finally, we evaluate the proposed method on four mainstream surveillance crowd datasets, Shanghai Tech Part B, WorldExpo'10, Mall, and UCSD. Extensive experiments are evidence that our approach outperforms the state-of-the-art methods for the same cross-domain counting problem.
|
29
|
Spatial Pattern Evaluation of Rural Tourism via the Multifactor-Weighted Neural Network Model in the Big Data Era. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021; 2021:5845545. [PMID: 34497638 PMCID: PMC8419508 DOI: 10.1155/2021/5845545] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/27/2021] [Accepted: 08/17/2021] [Indexed: 11/17/2022]
Abstract
The exploration of the evaluation effect of rural tourism spatial pattern based on the multifactor-weighted neural network model in the era of big data aims to optimize the spatial layout of rural tourist attractions. There are plenty of problems such as improper site selection, layout dispersion, and market competition disorder of rural tourism caused by insufficient consideration of planning and tourist market. Hence, the multifactor model after simple weighting is combined with the neural network to construct a spatiotemporal convolution neural network model based on multifactor weighting here to solve these problems. Moreover, the simulation experiment is conducted on the spatial pattern of rural tourism in the Ningxia Hui Autonomous Region to verify the evaluation performance of the constructed model. The results show that the prediction accuracy of the model is 97.69%, which is at least 2.13% higher than that of the deep learning algorithm used by other scholars. Through the evaluation and analysis of the spatial pattern of rural tourist attractions, the spatial distribution of scenic spots in Ningxia has strong stability from 2009 to 2019. Meanwhile, the number of scenic spots in the seven plates has increased and the time cost of scenic spot accessibility has changed significantly. Besides, the change rate of the one-hour isochronous cycle reaches 41.67%. This indicates that the neural network model has high prediction accuracy in evaluating the spatial pattern of rural tourist attractions, which can provide experimental reference for the digital development of the spatial pattern of rural tourism.
|
30
|
Gu L, Pang C, Zheng Y, Lyu C, Lyu L. Context-aware pyramid attention network for crowd counting. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02639-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/02/2023]
|
31
|
Peng S, Yin B, Hao X, Yang Q, Kumar A, Wang L. Depth and edge auxiliary learning for still image crowd density estimation. Pattern Anal Appl 2021. [DOI: 10.1007/s10044-021-01017-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
|
32
|
|
33
|
Wang Q, Gao J, Lin W, Li X. NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:2141-2149. [PMID: 32750840 DOI: 10.1109/tpami.2020.3013269] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Indexed: 05/26/2023]
Abstract
In the last decade, crowd counting and localization have attracted much attention from researchers because of their widespread applications, including crowd monitoring, public safety, and space design. Many convolutional neural networks (CNNs) have been designed to tackle this task. However, currently released datasets are so small that they cannot meet the needs of supervised CNN-based algorithms. To remedy this problem, we construct a large-scale congested crowd counting and localization dataset, NWPU-Crowd, consisting of 5,109 images with a total of 2,133,375 heads annotated with points and boxes. Compared with other real-world datasets, it contains various illumination scenes and has the largest density range (0∼20,033). In addition, a benchmark website is developed for impartially evaluating different methods, which allows researchers to submit results on the test set. Based on the proposed dataset, we further describe the data characteristics, evaluate the performance of some mainstream state-of-the-art (SOTA) methods, and analyze the new problems that arise on the new data. The benchmark is deployed at https://www.crowdbenchmark.com/, and the dataset/code/models/results are available at https://gjy3035.github.io/NWPU-Crowd-Sample-Code/.
|
34
|
Congested Crowd Counting via Adaptive Multi-Scale Context Learning. SENSORS 2021; 21:s21113777. [PMID: 34072408 PMCID: PMC8198824 DOI: 10.3390/s21113777] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/11/2021] [Revised: 05/23/2021] [Accepted: 05/27/2021] [Indexed: 12/14/2022]
Abstract
In this paper, we propose a novel congested crowd counting network for crowd density estimation, i.e., the Adaptive Multi-scale Context Aggregation Network (MSCANet). MSCANet efficiently leverages the spatial context information to accomplish crowd density estimation in a complicated crowd scene. To achieve this, a multi-scale context learning block, called the Multi-scale Context Aggregation module (MSCA), is proposed to first extract different scale information and then adaptively aggregate it to capture the full scale of the crowd. Employing multiple MSCAs in a cascaded manner, the MSCANet can deeply utilize the spatial context information and modulate preliminary features into more distinguishing and scale-sensitive features, which are finally applied to a 1 × 1 convolution operation to obtain the crowd density results. Extensive experiments on three challenging crowd counting benchmarks showed that our model yielded compelling performance against the other state-of-the-art methods. To thoroughly prove the generality of MSCANet, we extend our method to two relevant tasks: crowd localization and remote sensing object counting. The extension experiment results also confirmed the effectiveness of MSCANet.
|
35
|
Xiao L, Li C, Wang Y, Si W, Zhang D, Lin H, Cai X, Heng PA. Automatic identification of sweet spots from MERs for electrodes implantation in STN-DBS. Int J Comput Assist Radiol Surg 2021; 16:809-818. [PMID: 33907990 DOI: 10.1007/s11548-021-02377-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/10/2021] [Accepted: 04/12/2021] [Indexed: 11/28/2022]
Abstract
PURPOSE Microelectrode recordings (MERs) are a significant clinical indicator for identifying sweet spots for implanted electrodes during deep brain stimulation of the subthalamic nucleus (STN-DBS) surgery. Because 1D MER signals are unbounded, wide-ranging, voluminous, and time-dependent, the purpose of this study is to propose an automatic and precise method for identifying sweet spots from MERs, reducing time-consuming and labor-intensive human annotation. METHODS We propose an automatic identification method of sweet spots from MERs for electrode implantation in STN-DBS. To better imitate the surgeons' observation and obtain more intuitive contextual information, we first use 2D Gramian angular summation field (GASF) images generated from MER data to determine sweet spots for electrode implantation. Then, we introduce the convolutional block attention module (CBAM) into a convolutional neural network (CNN) to identify the 2D GASF images of sweet spots for electrode implantation. RESULTS Experimental results illustrate that the identification result of our method is consistent with the doctor's decision, and that our method achieves accuracy and precision of 96.72% and 98.97%, respectively, which outperforms the state of the art for intraoperative sweet spot determination. CONCLUSIONS The proposed method is the first to automatically and accurately identify sweet spots from MERs for electrode implantation by combining an advanced time-series-to-image encoding with a CBAM-enhanced network model. Our method can assist neurosurgeons in automatically detecting the most likely locations of sweet spots for electrode implantation, which can provide an important indicator for target selection while reducing the localization error of the target during STN-DBS surgery.
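The Gramian angular summation field named in the abstract above turns a 1D signal into a 2D image. A minimal sketch of the standard construction follows: rescale the series to [-1, 1], map each value to an angle with arccos, and fill the matrix with cos(φ_i + φ_j). The function name `gasf` and the min-max rescaling are assumptions; the abstract does not say how the MER signals were windowed or normalized before encoding.

```python
import math


def gasf(series):
    """Gramian angular summation field of a 1D series, as a nested list."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Min-max rescale to [-1, 1] so that arccos is defined.
    scaled = [(2 * v - hi - lo) / (hi - lo) for v in series]
    # Clamp against floating-point drift just outside the interval.
    scaled = [max(-1.0, min(1.0, v)) for v in scaled]
    phi = [math.acos(v) for v in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


# A three-sample toy signal gives a 3x3 image.
image = gasf([0.0, 1.0, 2.0])
```

The resulting matrix is symmetric, and its main diagonal depends only on the individual rescaled samples, so the image keeps the temporal order of the signal along the diagonal while the off-diagonal entries encode pairwise relations.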
Affiliation(s)
- Linxia Xiao
- College of Control Science and Engineering, China University of Petroleum (East China), Qingdao, 266580, China
- Caizi Li
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Yanjiang Wang
- College of Control Science and Engineering, China University of Petroleum (East China), Qingdao, 266580, China
- Weixin Si
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Doudou Zhang
- Department of Neurosurgery, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, 518037, China; Shenzhen University School of Medicine, Shenzhen, 518061, China
- Hai Lin
- Department of Neurosurgery, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, 518037, China; Shenzhen University School of Medicine, Shenzhen, 518061, China
- Xiaodong Cai
- Department of Neurosurgery, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, 518037, China; Shenzhen University School of Medicine, Shenzhen, 518061, China
- Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, 999077, China; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
|
36
|
Khaki S, Pham H, Han Y, Kuhl A, Kent W, Wang L. DeepCorn: A semi-supervised deep learning method for high-throughput image-based corn kernel counting and yield estimation. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106874] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/28/2022]
|
37
|
|
38
|
Xue Y, Li Y, Liu S, Zhang X, Qian X. Crowd Scene Analysis Encounters High Density and Scale Variation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:2745-2757. [PMID: 33502976 DOI: 10.1109/tip.2021.3049963] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/12/2023]
Abstract
Crowd scene analysis receives growing attention due to its wide applications. Grasping the accurate crowd location is important for identifying high-risk regions. In this article, we propose a Compressed Sensing based Output Encoding (CSOE) scheme, which casts detecting pixel coordinates of small objects into a task of signal regression in encoding signal space. To prevent gradient vanishing, we derive our own sparse reconstruction backpropagation rule that is adaptive to distinct implementations of sparse reconstruction and makes the whole model end-to-end trainable. With the support of CSOE and the backpropagation rule, the proposed method shows more robustness to deep model training error, which is especially harmful to crowd counting and localization. The proposed method achieves state-of-the-art performance across four mainstream datasets, especially achieves excellent results in highly crowded scenes. A series of analysis and experiments support our claim that regression in CSOE space is better than traditionally detecting coordinates of small objects in pixel space for highly crowded scenes.
|
39
|
Liang D, Yi B. Two-stage three-way enhanced technique for ensemble learning in inclusive policy text classification. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.08.051] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 11/16/2022]
|
40
|
Wang J, Chen P, Zheng N, Chen B, Principe JC, Wang FY. Associations between MSE and SSIM as cost functions in linear decomposition with application to bit allocation for sparse coding. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.10.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/23/2022]
|
41
|
Zhang K, Wang H, Liu W, Li M, Lu J, Liu Z. An efficient semi-supervised manifold embedding for crowd counting. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106634] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/23/2022]
|
42
|
Yuan L, Qiu Z, Liu L, Wu H, Chen T, Chen P, Lin L. Crowd counting via scale-communicative aggregation networks. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.05.042] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/24/2022]
|
43
|
|
44
|
Wang Y, Zhang W, Liu Y, Zhu J. Two-branch fusion network with attention map for crowd counting. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.06.034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/24/2022]
|
45
|
Wang S, Lu Y, Zhou T, Di H, Lu L, Zhang L. SCLNet: Spatial context learning network for congested crowd counting. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.04.139] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 10/24/2022]
|
46
|
|
47
|
Wu X, Zheng Y, Ye H, Hu W, Ma T, Yang J, He L. Counting crowds with varying densities via adaptive scenario discovery framework. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.02.045] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
|
48
|
Multi-level feature fusion based Locality-Constrained Spatial Transformer network for video crowd counting. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.01.087] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 11/23/2022]
|