51
|
Ban T, Usui T, Yamamoto T. Spatial Autoregressive Model for Estimation of Visitors' Dynamic Agglomeration Patterns Near Event Location. SENSORS 2021; 21:s21134577. [PMID: 34283103 PMCID: PMC8271624 DOI: 10.3390/s21134577] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/14/2021] [Revised: 06/20/2021] [Accepted: 06/29/2021] [Indexed: 11/20/2022]
Abstract
The rapid development of ubiquitous mobile computing has enabled the collection of new types of massive traffic data to understand collective movement patterns in social spaces. Contributing to the understanding of crowd formation and dispersal in populated areas, we developed a model of visitors’ dynamic agglomeration patterns at a particular event using dynamic population data. This information, a type of big data, comprised aggregate Global Positioning System (GPS) location data automatically collected from mobile phones without users’ intervention over a grid with a spatial resolution of 250 m. Herein, spatial autoregressive models with two-step adjacency matrices are proposed to represent visitors’ movement between grids around the event site. We confirmed that the proposed models had a higher goodness-of-fit than those without spatial or temporal autocorrelations. The results also show a significant reduction in accuracy when applied to prediction with estimated values of the endogenous variables of prior time periods.
Affiliation(s)
- Takumi Ban
  - Department of Civil Engineering, Graduate School of Engineering, Nagoya University, Nagoya 464-8603, Japan
  - Correspondence: (T.B.); (T.Y.)
- Tomotaka Usui
  - Faculty of Human Environments, University of Human Environments, Okazaki 444-3505, Japan
- Toshiyuki Yamamoto
  - Institute of Materials and Systems for Sustainability, Nagoya University, Nagoya 464-8603, Japan
  - Correspondence: (T.B.); (T.Y.)
|
52
|
Khan MA, Mittal M, Goyal LM, Roy S. A deep survey on supervised learning based human detection and activity classification methods. MULTIMEDIA TOOLS AND APPLICATIONS 2021; 80:27867-27923. [DOI: 10.1007/s11042-021-10811-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/02/2020] [Revised: 03/03/2021] [Accepted: 03/10/2021] [Indexed: 08/25/2024]
|
53
|
Bendali-Braham M, Weber J, Forestier G, Idoumghar L, Muller PA. Recent trends in crowd analysis: A review. MACHINE LEARNING WITH APPLICATIONS 2021. [DOI: 10.1016/j.mlwa.2021.100023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/18/2023]
|
54
|
Hobbs J, Khachatryan V, Anandan BS, Hovhannisyan H, Wilson D. Broad Dataset and Methods for Counting and Localization of On-Ear Corn Kernels. Front Robot AI 2021; 8:627009. [PMID: 34109221 PMCID: PMC8183680 DOI: 10.3389/frobt.2021.627009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/07/2020] [Accepted: 03/18/2021] [Indexed: 11/13/2022]
Abstract
Crop monitoring and yield prediction are central to management decisions for farmers. One key task is counting the number of kernels on an ear of corn to estimate yield in a field. As ears of corn can easily have 400-900 kernels, manual counting is unrealistic; traditionally, growers have approximated the number of kernels on an ear of corn through a mixture of counting and estimation. With the success of deep learning, these human estimates can now be replaced with more accurate machine learning models, many of which are efficient enough to run on a mobile device. Although a conceptually simple task, the counting and localization of hundreds of instances in an image is challenging for many image detection algorithms which struggle when objects are small in size and large in number. We compare different detection-based frameworks, Faster R-CNN, YOLO, and density-estimation approaches for on-ear corn kernel counting and localization. In addition to the YOLOv5 model which is accurate and edge-deployable, our density-estimation approach produces high-quality results, is lightweight enough for edge deployment, and maintains its computational efficiency independent of the number of kernels in the image. Additionally, we seek to standardize and broaden this line of work through the release of a challenging dataset with high-quality, multi-class segmentation masks. This dataset firstly enables quantitative comparison of approaches within the kernel counting application space and secondly promotes further research in transfer learning and domain adaptation, large count segmentation methods, and edge deployment methods.
Affiliation(s)
- Barathwaj S. Anandan
  - Intelinair, Inc., Champaign, IL, United States
  - The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, United States
|
55
|
Abstract
Owing to the increased use of urban rail transit, the flow of passengers on metro platforms tends to increase sharply during peak periods. Monitoring passenger flow in such areas is important for security-related reasons. In this paper, to solve the problem of metro platform passenger flow detection, we propose a CNN (convolutional neural network)-based network called the MP (metro platform)-CNN to accurately count people on metro platforms. The proposed method is composed of three major components: a group of convolutional neural networks is used on the front end to extract image features, a multiscale feature extraction module is used to enhance multiscale features, and transposed convolution is used for upsampling to generate a high-quality density map. Existing crowd-counting datasets do not adequately cover all of the challenging situations considered in this study. Therefore, we collected images from surveillance videos of a metro platform to form a dataset containing 627 images, with 9243 annotated heads. Extensive experiments showed that our method performed well on the self-built dataset, with the lowest estimation error. Moreover, the proposed method was competitive with other methods on four standard crowd-counting datasets.
|
56
|
Liu W, Wang H, Luo H, Zhang K, Lu J, Xiong Z. Pseudo-label growth dictionary pair learning for crowd counting. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02274-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/24/2022]
|
57
|
Object Detection in Drone Imagery via Sample Balance Strategies and Local Feature Enhancement. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11083547] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
With the advent of drones, new potential applications have emerged for the unconstrained analysis of images and videos from aerial view cameras. Despite the tremendous success of the generic object detection methods developed using ground-based photos, a considerable performance drop is observed when these same methods are directly applied to images captured by Unmanned Aerial Vehicles (UAVs). Usually, most of the work goes into improving the performance of the detector in aspects such as loss design, training sample selection, feature enhancement, and so forth. This paper proposes a detection framework based on an anchor-free detector with several modules, including a sample balance strategies module and a super-resolved generated feature module, to improve performance. We propose the sample balance strategies module to reduce the imbalance among training samples, especially the imbalance between positive and negative samples and between easy and hard samples. Due to the high frequencies and noisy representation of the small objects in images captured by drones, the detection task is extraordinarily challenging. However, when compared with other algorithms of this kind, our method achieves better results. We also propose a super-resolved generated GAN (Generative Adversarial Network) module with center-ness weights to effectively enhance the local feature map. Finally, we demonstrate the effectiveness of our method and the proposed modules by achieving state-of-the-art performance on the VisDrone2020 benchmarks.
|
58
|
Minoura H, Yonetani R, Nishimura M, Ushiku Y. Crowd Density Forecasting by Modeling Patch-Based Dynamics. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2020.3043169] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
|
59
|
|
60
|
Wang Y, Hou J, Hou X, Chau LP. A Self-Training Approach for Point-Supervised Object Detection and Counting in Crowds. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:2876-2887. [PMID: 33539297 DOI: 10.1109/tip.2021.3055632] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 06/12/2023]
Abstract
In this article, we propose a novel self-training approach named Crowd-SDNet that enables a typical object detector trained only with point-level annotations (i.e., objects are labeled with points) to estimate both the center points and sizes of crowded objects. Specifically, during training, we utilize the available point annotations to supervise the estimation of the center points of objects directly. Based on a locally-uniform distribution assumption, we initialize pseudo object sizes from the point-level supervisory information, which are then leveraged to guide the regression of object sizes via a crowdedness-aware loss. Meanwhile, we propose a confidence and order-aware refinement scheme to continuously refine the initial pseudo object sizes such that the ability of the detector is increasingly boosted to detect and count objects in crowds simultaneously. Moreover, to address extremely crowded scenes, we propose an effective decoding method to improve the detector's representation ability. Experimental results on the WiderFace benchmark show that our approach significantly outperforms state-of-the-art point-supervised methods under both detection and counting tasks, i.e., our method improves the average precision by more than 10% and reduces the counting error by 31.2%. Besides, our method obtains the best results on the crowd counting and localization datasets (i.e., ShanghaiTech and NWPU-Crowd) and vehicle counting datasets (i.e., CARPK and PUCPR+) compared with state-of-the-art counting-by-detection methods. The code will be publicly available at https://github.com/WangyiNTU/Point-supervised-crowd-detection.
|
61
|
Hobbs J, Prakash P, Paull R, Hovhannisyan H, Markowicz B, Rose G. Large-Scale Counting and Localization of Pineapple Inflorescence Through Deep Density-Estimation. FRONTIERS IN PLANT SCIENCE 2021; 11:599705. [PMID: 33584745 PMCID: PMC7876329 DOI: 10.3389/fpls.2020.599705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2020] [Accepted: 12/09/2020] [Indexed: 06/12/2023]
Abstract
Natural flowering affects fruit development and quality, and impacts the harvest of specialty plants like pineapple. Pineapple growers use chemicals to induce flowering so that most plants within a field produce fruit of high quality that is ready to harvest at the same time. Since pineapple is hand-harvested, the ability to harvest all of the fruit of a field in a single pass is critical to reduce field losses, costs, and waste, and to maximize efficiency. Traditionally, due to high planting densities, pineapple growers have been limited to gathering crop intelligence through manual inspection around the edges of the field, giving them only a limited view of their crop's status. Through advances in remote sensing and computer vision, we can enable regular inspection of the field and automated inflorescence counting, allowing growers to optimize their management practices. Our work uses a deep learning-based density estimation approach to count the number of flowering pineapple plants in a field with a test MAE of 11.5 and MAPD of 6.37%. Notably, the computational complexity of this method does not depend on the number of plants present, and it therefore scales efficiently to detect over 1.6 million flowering plants in a field. We further embed this approach in an active learning framework for continual learning and model improvement.
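As a rough illustration of the quantities this abstract reports, the sketch below shows how a count can be read off a density map and how MAE and MAPD might be computed over a test set. It is not the authors' code: the function names are hypothetical, the sample values are invented, and MAPD is assumed here to mean mean absolute percent deviation.

```python
from typing import Sequence


def count_from_density(density_map: Sequence[Sequence[float]]) -> float:
    """Estimated object count is the sum (integral) of the density map.

    The cost of this step depends on the map size, not on how many
    objects are present in the image.
    """
    return sum(sum(row) for row in density_map)


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error between predicted and true counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mapd(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute percent deviation, in percent (assumed definition)."""
    return 100.0 * sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


# Invented toy data, for illustration only.
toy_density_map = [[0.0, 0.5, 0.5], [0.25, 0.75, 0.0]]
toy_predicted = [90.0, 210.0, 50.0]
toy_actual = [100.0, 200.0, 50.0]
```

Calling `count_from_density(toy_density_map)` sums the six cells of the toy map, while `mae` and `mapd` compare per-image predicted counts with ground-truth counts.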
Affiliation(s)
- Prajwal Prakash
  - IntelinAir, Inc., Champaign, IL, United States
  - Department of Electrical Engineering, Columbia University, New York, NY, United States
- Robert Paull
  - Department of Tropical Plant and Soil Sciences, University of Hawaii at Manoa, Honolulu, HI, United States
- Greg Rose
  - IntelinAir, Inc., Champaign, IL, United States
|
62
|
Chen TZ, Chen YY, Lai JH. Estimating Bus Cross-Sectional Flow Based on Machine Learning Algorithm Combined with Wi-Fi Probe Technology. SENSORS 2021; 21:s21030844. [PMID: 33513884 PMCID: PMC7865710 DOI: 10.3390/s21030844] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/18/2020] [Revised: 01/20/2021] [Accepted: 01/23/2021] [Indexed: 12/03/2022]
Abstract
As cities expand, problems in public transport systems will become more prominent. For single-swipe buses, the traditional way to obtain section passenger flow relies on surveillance video identification or manual investigation. This paper adopts a new method: six Wi-Fi probes installed in the bus collect wireless signals from mobile terminals inside and outside the bus, and machine learning algorithms estimate the passenger flow of the bus. Five signal features were selected, and three machine learning algorithms (Random Forest, K-Nearest Neighbor, and Support Vector Machines) were used to learn the patterns in these features. Because the signal strength was affected by the complexity of the environment, a strain function was proposed that varies with the degree of congestion in the bus. Finally, the error between the average estimation result and the manual survey was 0.1338. The proposed method is therefore suitable for passenger flow identification on single-swipe buses in small and medium-sized cities, and it could improve the operational efficiency of buses and reduce the waiting pressure on passengers during the morning and evening rush hours.
|
63
|
Wan J, Kumar NS, Chan AB. Fine-Grained Crowd Counting. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:2114-2126. [PMID: 33439838 DOI: 10.1109/tip.2021.3049938] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Current crowd counting algorithms are concerned only with the number of people in an image and lack low-level, fine-grained information about the crowd. For many practical applications, the total number of people in an image is not as useful as the number of people in each sub-category. For example, knowing the number of people waiting in line or browsing can help retail stores; knowing the number of people standing/sitting can help restaurants/cafeterias; knowing the number of violent/non-violent people can help police in crowd management. In this article, we propose fine-grained crowd counting, which differentiates a crowd into categories based on the low-level behavior attributes of the individuals (e.g. standing/sitting or violent behavior) and then counts the number of people in each category. To enable research in this area, we construct a new dataset of four real-world fine-grained counting tasks: traveling direction on a sidewalk, standing or sitting, waiting in line or not, and exhibiting violent behavior or not. Since the appearance features of different crowd categories are similar, the challenge of fine-grained crowd counting is to effectively utilize contextual information to distinguish between categories. We propose a two-branch architecture, consisting of a density map estimation branch and a semantic segmentation branch. We propose two refinement strategies for improving the predictions of the two branches. First, to encode contextual information, we propose feature propagation guided by the density map prediction, which eliminates the effect of background features during propagation. Second, we propose a complementary attention model to share information between the two branches. Experimental results confirm the effectiveness of our method.
|
64
|
Fu H, Wang C, Cui G, She W, Zhao L. Ramie Yield Estimation Based on UAV RGB Images. SENSORS 2021; 21:s21020669. [PMID: 33477949 PMCID: PMC7833380 DOI: 10.3390/s21020669] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/03/2020] [Revised: 01/03/2021] [Accepted: 01/15/2021] [Indexed: 11/30/2022]
Abstract
Timely and accurate crop growth monitoring and yield estimation are important for field management. The traditional sampling method used for estimation of ramie yield is destructive. Thus, this study proposed a new method for estimating ramie yield based on field phenotypic data obtained from unmanned aerial vehicle (UAV) images. A UAV platform carrying RGB cameras was employed to collect ramie canopy images during the whole growth period. The vegetation indices (VIs), plant number, and plant height were extracted from the UAV-based images, and these data were then combined to establish a yield estimation model. Among all of the UAV-based image data, we found that the structure features (plant number and plant height) reflected the ramie yield better than the spectral features. Among the structure features, the plant number was the most useful index for monitoring the yield, with a correlation coefficient of 0.6. By fusing multiple characteristic parameters, the yield estimation model based on multiple linear regression was clearly more accurate than the stepwise linear regression model, with a determination coefficient of 0.66 and a relative root mean square error of 1.592 kg. Our study reveals that it is feasible to monitor crop growth based on UAV images and that the fusion of phenotypic data can improve the accuracy of yield estimations.
Affiliation(s)
- Hongyu Fu
  - Ramie Research Institute of Hunan Agricultural University, College of Agricultural, Hunan Agricultural University, Changsha 410128, China; (H.F.); (W.S.); (L.Z.)
- Chufeng Wang
  - Macro Agriculture Research Institute, College of Resource and Environment, Huazhong Agricultural University, 1 Shizishan Street, Wuhan 430000, China
- Guoxian Cui
  - Ramie Research Institute of Hunan Agricultural University, College of Agricultural, Hunan Agricultural University, Changsha 410128, China; (H.F.); (W.S.); (L.Z.)
  - Correspondence:
- Wei She
  - Ramie Research Institute of Hunan Agricultural University, College of Agricultural, Hunan Agricultural University, Changsha 410128, China; (H.F.); (W.S.); (L.Z.)
- Liang Zhao
  - Ramie Research Institute of Hunan Agricultural University, College of Agricultural, Hunan Agricultural University, Changsha 410128, China; (H.F.); (W.S.); (L.Z.)
|
65
|
Javaid I, Zhang S, Isselmou AEK, Kamhi S, Ahmad IS, Kulsum U. Brain Tumor Classification & Segmentation by Using Advanced DNN, CNN & ResNet-50 Neural Networks. INTERNATIONAL JOURNAL OF CIRCUITS, SYSTEMS AND SIGNAL PROCESSING 2020; 14:1011-1029. [DOI: 10.46300/9106.2020.14.129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
In the medical domain, brain image classification is an extremely challenging field. Medical images play a vital role in helping doctors make precise diagnoses and in the surgical process. Intelligent algorithms make it feasible to detect lesions in medical images quickly, and extracting features from medical images is especially necessary. Several studies have integrated multiple algorithms in the medical imaging domain. In feature extraction from medical images, a vast amount of data is analyzed to produce results that help physicians deliver more precise diagnoses. Image processing is used extensively in medical science to advance early detection and treatment. This paper takes tumor and healthy images as the research object and first performs image processing and data augmentation to feed the dataset to the neural networks. Deep neural networks (DNN) have, to date, shown outstanding achievement in classification and segmentation tasks. With this in mind, we adopted a pre-trained ResNet-50 model for image analysis. The paper proposes three different neural networks: DNN, CNN, and ResNet-50. The split dataset is assigned individually to each simplified neural network. Once an image is accurately classified as a tumor, Otsu segmentation is employed to extract the tumor alone. The experimental outcomes show that the ResNet-50 algorithm achieves a high accuracy of 0.996, a precision of 1.00, the best F1 score of 1.0, and a minimum test loss of 0.0269 in brain tumor classification. Extensive experiments demonstrate the efficiency and accuracy of the proposed tumor detection and segmentation. Our approach is sufficiently comprehensive and requires only minimal pre- and post-processing, which allows its adoption in various medical image classification and segmentation tasks.
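The Otsu step mentioned in this abstract can be sketched as follows. This is a generic, textbook version of Otsu's thresholding on 8-bit grayscale values, not the authors' pipeline; the function names and the toy pixel list are hypothetical.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class (background), the rest the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels in the background class so far
    sum_bg = 0  # sum of intensities in the background class so far
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # all pixels are background; nothing left to split
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def segment(pixels: Sequence[int]) -> List[bool]:
    """Binary mask: True where a pixel is brighter than the Otsu threshold."""
    t = otsu_threshold(pixels)
    return [p > t for p in pixels]


# Invented toy intensities: three dark pixels and three bright ones.
toy_pixels = [10, 12, 11, 200, 210, 205]
```

On a real image the pixel list would be the flattened grayscale slice, and the resulting mask would typically be cleaned up further before the region is extracted.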
Affiliation(s)
- Imran Javaid
  - Hebei University of Technology, 8 Dingzigu 1st Rd, Hongqiao Qu, China
- Shuai Zhang
  - Hebei University of Technology, 8 Dingzigu 1st Rd, Hongqiao Qu, China
- Souha Kamhi
  - Hebei University of Technology, 8 Dingzigu 1st Rd, Hongqiao Qu, China
- Isah Salim Ahmad
  - Hebei University of Technology, 8 Dingzigu 1st Rd, Hongqiao Qu, China
- Ummay Kulsum
  - Hebei University of Technology, 8 Dingzigu 1st Rd, Hongqiao Qu, China
|
66
|
|
67
|
Tripathi G, Singh K, Vishwakarma DK. Violence recognition using convolutional neural network: A survey. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2020. [DOI: 10.3233/jifs-201400] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Abstract
Violence detection is a challenging task in the computer vision domain. A violence detection framework depends on detecting changes in crowd behaviour. Violence erupts due to disagreement over an idea, injustice, or severe disagreement. Every country aims to maintain law, order, and peace within its territory. Violence detection is therefore an important task for authorities seeking to maintain peace. Traditional violence detection methods depend heavily on hand-crafted features. The world is now transitioning to artificial intelligence-based techniques. Automatic feature extraction and classification from images and videos is the new norm in the surveillance domain. Deep learning provides a platform on which non-linear features can be extracted, self-learnt, and classified with the appropriate tool. One such tool is the convolutional neural network, also known as a ConvNet, which can automatically extract features and classify them into their respective domains. To date, there has been no survey of techniques for deciphering violent behaviour using ConvNets. We hope that this survey becomes a baseline for future violence detection and analysis in the deep learning domain.
Affiliation(s)
- Gaurav Tripathi
  - Department of Electronics and Communication Engineering, Delhi Technological University, Delhi, India
- Kuldeep Singh
  - Department of Electronics & Communication Engineering, Malaviya National Institute of Technology, Jaipur, India
|
68
|
DeepSOCIAL: Social Distancing Monitoring and Infection Risk Assessment in COVID-19 Pandemic. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10217514] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Indexed: 12/21/2022]
Abstract
Social distancing is a solution recommended by the World Health Organisation (WHO) to minimise the spread of COVID-19 in public places. The majority of governments and national health authorities have set 2-m physical distancing as a mandatory safety measure in shopping centres, schools and other covered areas. In this research, we develop a hybrid Computer Vision and YOLOv4-based Deep Neural Network (DNN) model for automated people detection in crowds in indoor and outdoor environments using common CCTV security cameras. The proposed DNN model, in combination with an adapted inverse perspective mapping (IPM) technique and the SORT tracking algorithm, leads to robust people detection and social distancing monitoring. The model has been trained against the two most comprehensive datasets available at the time of the research: the Microsoft Common Objects in Context (MS COCO) and Google Open Image datasets. The system has been evaluated against the Oxford Town Centre dataset (including 150,000 instances of people detection) with superior performance compared to three state-of-the-art methods. The evaluation has been conducted in challenging conditions, including occlusion, partial visibility, and lighting variations, with a mean average precision of 99.8% and a real-time speed of 24.1 fps. We also provide an online infection risk assessment scheme by statistical analysis of the spatio-temporal data from people's moving trajectories and the rate of social distancing violations. We identify high-risk zones with the highest possibility of virus spread and infection. This may help authorities to redesign the layout of a public place or to take precautionary actions to mitigate high-risk zones.
The developed model is a generic and accurate people detection and tracking solution that can be applied in many other fields such as autonomous vehicles, human action recognition, anomaly detection, sports, crowd analysis, or any other research area where human detection is the centre of attention.
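The distance check at the core of such monitoring can be sketched as below. This is only an illustration, not the DeepSOCIAL implementation: it assumes detection and inverse perspective mapping have already produced ground-plane positions in metres, and the function name, the default threshold argument and the sample coordinates are hypothetical.

```python
from itertools import combinations
from math import hypot
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def distancing_violations(
    positions: Sequence[Point], min_distance_m: float = 2.0
) -> List[Tuple[int, int]]:
    """Return index pairs of people standing closer than min_distance_m.

    positions are (x, y) ground-plane coordinates in metres, i.e. after the
    camera view has been mapped to a bird's-eye view (assumed done upstream).
    """
    pairs = []
    for (i, (xi, yi)), (j, (xj, yj)) in combinations(enumerate(positions), 2):
        if hypot(xi - xj, yi - yj) < min_distance_m:
            pairs.append((i, j))
    return pairs


# Invented positions: persons 0 and 1 are 1 m apart, persons 2 and 3 are 1.5 m apart.
toy_positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.5)]
```

The pairwise loop is quadratic in the number of people per frame; counting how often each pair or location appears across frames is one way the violation rate mentioned in the abstract could be accumulated.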
|
69
|
Mohapatra AG, Khanna A, Gupta D, Mohanty M, Albuquerque VHC. An experimental approach to evaluate machine learning models for the estimation of load distribution on suspension bridge using FBG sensors and IoT. Comput Intell 2020. [DOI: 10.1111/coin.12406] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Affiliation(s)
- Ambarish G. Mohapatra
  - Department of Electronics and Instrumentation Engineering, Silicon Institute of Technology, Bhubaneswar, India
- Ashish Khanna
  - Maharaja Agrasen Institute of Technology, Delhi, India
- Deepak Gupta
  - Maharaja Agrasen Institute of Technology, Delhi, India
- Maitri Mohanty
  - Computer Science & Engineering, GIET University, Gunupur, India
|
70
|
Szczepanek R. Analysis of pedestrian activity before and during COVID-19 lockdown, using webcam time-lapse from Cracow and machine learning. PeerJ 2020; 8:e10132. [PMID: 33083150 PMCID: PMC7543725 DOI: 10.7717/peerj.10132] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/12/2020] [Accepted: 09/18/2020] [Indexed: 01/15/2023]
Abstract
At the turn of February and March 2020, the COVID-19 pandemic reached Europe. Many countries, including Poland, imposed lockdowns to secure social distance between potentially infected people. Stay-at-home orders and movement control within public space affected not only the tourism industry but also the everyday life of the inhabitants. Hourly time-lapse images from four HD webcams in Cracow (Poland) are used in this study to estimate how pedestrian activity changed during the COVID-19 lockdown. The collected data covers the period from 9 June 2016 to 19 April 2020 and comes from various urban zones. One zone is a tourist zone, one is residential and two are mixed. In the first stage of the analysis, a state-of-the-art machine learning algorithm (YOLOv3) is used to detect people. Additionally, a non-standard application of the YOLO method is proposed, oriented to the images from HD webcams. This approach (YOLOtiled) is less prone to pedestrian detection errors, with the only drawback being the longer computation time. Splitting the HD image into smaller tiles increases the number of detected pedestrians by over 50%. In the second stage, the analysis of pedestrian activity before and during the COVID-19 lockdown is conducted for hourly, daily and weekly averages. Depending on the type of urban zone, the number of pedestrians decreased from 33% in residential zones to 85% in tourist zones located in the Old Town. The presented method allows for more efficient detection and counting of pedestrians from HD time-lapse webcam images compared to SSD, YOLOv3 and Faster R-CNN. The result of the research is a published database with the detected number of pedestrians from the four-year observation period for four locations in Cracow.
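The tiling idea described in this abstract can be sketched as follows. This is not the YOLOtiled code: the tile size, the optional overlap, and the `detect` callable are assumptions made for illustration, and the abstract does not state how tiles were actually laid out.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int, overlap: int = 0) -> List[Box]:
    """Split a width x height image into tiles of at most tile x tile pixels."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    step = tile - overlap
    boxes = []
    # Stopping at (size - overlap) avoids a last tile that lies entirely
    # inside the overlap region of the previous one.
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def detect_tiled(
    image_size: Tuple[int, int],
    detect: Callable[[Box], Sequence[Box]],
    tile: int = 640,
    overlap: int = 0,
) -> List[Box]:
    """Run a per-tile detector and shift its boxes back to full-image coordinates.

    detect is a hypothetical callable that receives a tile box and returns
    detections in tile-local coordinates.
    """
    width, height = image_size
    detections = []
    for left, top, right, bottom in tile_boxes(width, height, tile, overlap):
        for x1, y1, x2, y2 in detect((left, top, right, bottom)):
            detections.append((x1 + left, y1 + top, x2 + left, y2 + top))
    return detections
```

With a non-zero overlap, the same pedestrian can be detected in two neighbouring tiles, so a de-duplication step such as non-maximum suppression would be needed before counting. Running the detector once per tile is also what makes the approach slower than a single pass over the full frame.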
Affiliation(s)
- Robert Szczepanek
  - Faculty of Environmental and Power Engineering, Cracow University of Technology, Cracow, Poland
|
71
|
Yılmaz B, Sheikh Abdullah SNH, Kok VJ. Vanishing region loss for crowd density estimation. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2020.08.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
72
|
Audio-Based Event Detection at Different SNR Settings Using Two-Dimensional Spectrogram Magnitude Representations. ELECTRONICS 2020. [DOI: 10.3390/electronics9101593] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
Audio-based event detection poses a number of different challenges that are not encountered in other fields, such as image detection. Challenges such as ambient noise, low Signal-to-Noise Ratio (SNR) and microphone distance are not yet fully understood. If the multimodal approaches are to become better in a range of fields of interest, audio analysis will have to play an integral part. Event recognition in autonomous vehicles (AVs) is such a field at a nascent stage that can especially leverage solely on audio or can be part of the multimodal approach. In this manuscript, an extensive analysis focused on the comparison of different magnitude representations of the raw audio is presented. The data on which the analysis is carried out is part of the publicly available MIVIA Audio Events dataset. Single channel Short-Time Fourier Transform (STFT), mel-scale and Mel-Frequency Cepstral Coefficients (MFCCs) spectrogram representations are used. Furthermore, aggregation methods of the aforementioned spectrogram representations are examined; the feature concatenation compared to the stacking of features as separate channels. The effect of the SNR on recognition accuracy and the generalization of the proposed methods on datasets that were both seen and not seen during training are studied and reported.
|
73
|
Yu Y, Zhu H, Wang L, Pedrycz W. Dense crowd counting based on adaptive scene division. INT J MACH LEARN CYB 2020. [DOI: 10.1007/s13042-020-01212-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
|
74
|
Elbishlawi S, Abdelpakey MH, Eltantawy A, Shehata MS, Mohamed MM. Deep Learning-Based Crowd Scene Analysis Survey. J Imaging 2020; 6:95. [PMID: 34460752 PMCID: PMC8321087 DOI: 10.3390/jimaging6090095] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/01/2020] [Revised: 08/27/2020] [Accepted: 08/31/2020] [Indexed: 11/30/2022] Open
Abstract
Recently, our world witnessed major events that attracted a lot of attention towards the importance of automatic crowd scene analysis. For example, the COVID-19 outbreak and public events require an automatic system to manage, count, secure, and track a crowd that shares the same area. However, analyzing crowd scenes is very challenging due to heavy occlusion, complex behaviors, and posture changes. This paper surveys deep learning-based methods for analyzing crowded scenes. The reviewed methods are categorized as (1) crowd counting and (2) crowd action recognition. Moreover, crowd scene datasets are surveyed. In addition to the above surveys, this paper proposes an evaluation metric for crowd scene analysis methods. This metric estimates the difference between the calculated crowd count and the actual count in crowd scene videos.
Affiliation(s)
- Sherif Elbishlawi
  - The University of British Columbia, 3333 University Way, Kelowna, BC V1V 1V7, Canada
- Agwad Eltantawy
  - The University of British Columbia, 3333 University Way, Kelowna, BC V1V 1V7, Canada
- Mohamed S. Shehata
  - The University of British Columbia, 3333 University Way, Kelowna, BC V1V 1V7, Canada
- Mostafa M. Mohamed
  - Electrical and Computer Engineering Department, University of Calgary, AB T2N 1N4, Canada
|
75
|
A hybrid model of convolutional neural networks and deep regression forests for crowd counting. APPL INTELL 2020. [DOI: 10.1007/s10489-020-01688-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022]
|
76
|
Dong L, Zhang H, Ji Y, Ding Y. Crowd counting by using multi-level density-based spatial information: A Multi-scale CNN framework. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.04.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 11/28/2022]
|
77
|
Jiang X, Zhang L, Lv P, Guo Y, Zhu R, Li Y, Pang Y, Li X, Zhou B, Xu M. Learning Multi-Level Density Maps for Crowd Counting. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:2705-2715. [PMID: 31562106 DOI: 10.1109/tnnls.2019.2933920] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
People in crowd scenes often exhibit the characteristic of imbalanced distribution. On the one hand, the size of people varies widely due to the camera perspective. People far away from the camera look smaller and are likely to occlude each other, whereas people near to the camera look larger and are relatively sparse. On the other hand, the number of people also varies greatly in the same or different scenes. This article aims to develop a novel model that can accurately estimate the crowd count from a given scene with imbalanced people distribution. To this end, we have proposed an effective multi-level convolutional neural network (MLCNN) architecture that first adaptively learns multi-level density maps and then fuses them to predict the final output. The density map of each level focuses on dealing with people of certain sizes. As a result, the fusion of multi-level density maps is able to tackle the large variation in people size. In addition, we introduce a new loss function named balanced loss (BL) to impose relatively balanced feedback during training, which helps further improve the performance of the proposed network. Furthermore, we introduce a new data set including 1111 images with a total of 49 061 head annotations. MLCNN is easy to train with only one end-to-end training stage. Experimental results demonstrate that our MLCNN achieves state-of-the-art performance. In particular, our MLCNN reaches a mean absolute error (MAE) of 242.4 on the UCF_CC_50 data set, which is 37.2 lower than the second-best result.
|
78
|
Crowd Monitoring and Localization Using Deep Convolutional Neural Network: A Review. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10144781] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/16/2022]
Abstract
Crowd management and monitoring is crucial for maintaining public safety and is an important research topic. Developing a robust crowd monitoring system (CMS) is a challenging task as it involves addressing many key issues such as density variation, irregular distribution of objects, occlusions, pose estimation, etc. Crowd gatherings at various places like hospitals, parks, stadiums, airports, and cultural and religious sites are usually monitored by Closed-Circuit Television (CCTV) cameras. The drawbacks of CCTV cameras are: limited area coverage, installation problems, movability, high power consumption and constant monitoring by the operators. Therefore, many researchers have turned towards computer vision and machine learning, which have overcome these issues by minimizing the need for human involvement. This review aims to categorize and analyze, as well as provide, the latest developments and performance evolution in crowd monitoring using different machine learning techniques and methods published in journals and conferences over the past five years.
|
79
|
Sun Y, Jin J, Wu X, Ma T, Yang J. Counting Crowds with Perspective Distortion Correction via Adaptive Learning. SENSORS 2020; 20:s20133781. [PMID: 32640552 PMCID: PMC7374275 DOI: 10.3390/s20133781] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/08/2020] [Revised: 06/18/2020] [Accepted: 07/03/2020] [Indexed: 12/01/2022]
Abstract
The goal of crowd counting is to estimate the number of people in an image. At present, using regression to count people has become a mainstream method. It is worth noting that, with the development of convolutional neural networks (CNNs), CNN-based methods have become a research hotspot. Locating each person in the image is a more interesting problem than simply predicting the number of people in it. Perspective transformation remains a challenge, because perspective distortion causes differences in the apparent size of people in the image. To address perspective distortion and locate people more accurately, we design a novel framework named Adaptive Learning Network (CAL). We use VGG as the backbone. After each pooling layer, we collect the features at 1/2, 1/4, 1/8, and 1/16 of the original image size and combine them with the weights learned by an adaptive learning branch. The adaptive learning branch operates on each image in the datasets. By combining the output features of different sizes for each image, the challenge of drastic changes in crowd size due to perspective transformation is reduced. We conducted experiments on four crowd counting data sets (i.e., ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF-QNRF), and the results show that our model performs well.
|
80
|
Wu X, Zheng Y, Ye H, Hu W, Ma T, Yang J, He L. Counting crowds with varying densities via adaptive scenario discovery framework. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.02.045] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
|
81
|
Zhang J, Zhao B, Yang C, Shi Y, Liao Q, Zhou G, Wang C, Xie T, Jiang Z, Zhang D, Yang W, Huang C, Xie J. Rapeseed Stand Count Estimation at Leaf Development Stages With UAV Imagery and Convolutional Neural Networks. FRONTIERS IN PLANT SCIENCE 2020; 11:617. [PMID: 32587594 PMCID: PMC7298076 DOI: 10.3389/fpls.2020.00617] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/07/2019] [Accepted: 04/22/2020] [Indexed: 06/11/2023]
Abstract
Rapeseed is an important oil crop in China. Timely estimation of rapeseed stand count at early growth stages provides useful information for precision fertilization, irrigation, and yield prediction. Based on the nature of rapeseed, the number of tillering leaves is strongly related to its growth stages. However, no field study has been reported on estimating rapeseed stand count by the number of leaves recognized with convolutional neural networks (CNNs) in unmanned aerial vehicle (UAV) imagery. The objectives of this study were to provide a case for rapeseed stand counting with reference to the existing knowledge of the number of leaves per plant and to determine the optimal timing for counting after rapeseed emergence at leaf development stages with one to seven leaves. A CNN model was developed to recognize leaves in UAV-based imagery, and rapeseed stand count was estimated with the number of recognized leaves. The performance of leaf detection was compared using sample sizes of 16, 24, 32, 40, and 48 pixels. Leaf overcounting occurred when a leaf was much bigger than others as this bigger leaf was recognized as several smaller leaves. Results showed CNN-based leaf count achieved the best performance at the four- to six-leaf stage with F-scores greater than 90% after calibration with overcounting rate. On average, 806 out of 812 plants were correctly estimated on 53 days after planting (DAP) at the four- to six-leaf stage, which was considered as the optimal observation timing. For the 32-pixel patch size, root mean square error (RMSE) was 9 plants with relative RMSE (rRMSE) of 2.22% on 53 DAP, while the mean RMSE was 12 with mean rRMSE of 2.89% for all patch sizes. A sample size of 32 pixels was suggested to be optimal accounting for balancing performance and efficiency. The results of this study confirmed that it was feasible to estimate rapeseed stand count in field automatically, rapidly, and accurately. 
This study provided a special perspective in phenotyping and cultivation management for estimating seedling count for crops that have recognizable leaves at their early growth stage, such as soybean and potato.
Affiliation(s)
- Jian Zhang
  - Macro Agriculture Research Institute, College of Resource and Environment, Huazhong Agricultural University, Wuhan, China
  - Key Laboratory of Arable Land Conservation (Middle and Lower Reaches of Yangtze River), Ministry of Agriculture, Wuhan, China
- Biquan Zhao
  - Macro Agriculture Research Institute, College of Resource and Environment, Huazhong Agricultural University, Wuhan, China
  - Key Laboratory of Arable Land Conservation (Middle and Lower Reaches of Yangtze River), Ministry of Agriculture, Wuhan, China
- Chenghai Yang
  - Aerial Application Technology Research Unit, USDA-Agricultural Research Service, College Station, TX, United States
- Yeyin Shi
  - Department of Biological Systems Engineering, University of Nebraska–Lincoln, Lincoln, NE, United States
- Qingxi Liao
  - College of Engineering, Huazhong Agricultural University, Wuhan, China
- Guangsheng Zhou
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, China
- Chufeng Wang
  - Macro Agriculture Research Institute, College of Resource and Environment, Huazhong Agricultural University, Wuhan, China
  - Key Laboratory of Arable Land Conservation (Middle and Lower Reaches of Yangtze River), Ministry of Agriculture, Wuhan, China
- Tianjin Xie
  - Macro Agriculture Research Institute, College of Resource and Environment, Huazhong Agricultural University, Wuhan, China
  - Key Laboratory of Arable Land Conservation (Middle and Lower Reaches of Yangtze River), Ministry of Agriculture, Wuhan, China
- Zhao Jiang
  - Macro Agriculture Research Institute, College of Resource and Environment, Huazhong Agricultural University, Wuhan, China
  - Key Laboratory of Arable Land Conservation (Middle and Lower Reaches of Yangtze River), Ministry of Agriculture, Wuhan, China
- Dongyan Zhang
  - Anhui Engineering Laboratory of Agro-Ecological Big Data, Anhui University, Hefei, China
- Wanneng Yang
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, China
- Chenglong Huang
  - College of Engineering, Huazhong Agricultural University, Wuhan, China
- Jing Xie
  - College of Science, Huazhong Agricultural University, Wuhan, China
|
82
|
Kong W, Li H, Zhang X, Zhao G. A multi-context representation approach with multi-task learning for object counting. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105927] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
|
83
|
J-LDFR: joint low-level and deep neural network feature representations for pedestrian gender classification. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-05015-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/24/2022]
|
84
|
Large-Scale Crowd Analysis through the Use of Passive Radio Sensing Networks. SENSORS 2020; 20:s20092624. [PMID: 32375382 PMCID: PMC7249162 DOI: 10.3390/s20092624] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/01/2020] [Revised: 04/24/2020] [Accepted: 05/01/2020] [Indexed: 11/17/2022]
Abstract
The creation of an automatic crowd estimation system capable of providing reliable, real-time estimates of human crowd sizes would be an invaluable tool for organizers of large-scale events, particularly so in the context of safety management. We describe a set of experiments in which we installed a passive Radio Frequency (RF) sensor network in different environments containing thousands of human individuals and discuss the accuracy with which the resulting measurements can be used to estimate the sizes of these crowds. Depending on the selected training approach, a median crowd estimation error of 184 people could be obtained for a large-scale environment which contained 3227 people at its peak. Additionally, we look into the potential benefits of dividing one of our experimental environments into multiple subregions and open up a potentially interesting new topic of research regarding the estimation of crowd flows. Finally, we investigate the combination of our measurements with another source of crowd-related data: sales data from drink stands within the environment. In doing so, we aim to integrate the concept of an automatic RF-based crowd estimation system into the broader domain of crowd analysis.
|
85
|
Redesigned Skip-Network for Crowd Counting with Dilated Convolution and Backward Connection. J Imaging 2020; 6:jimaging6050028. [PMID: 34460730 PMCID: PMC8321029 DOI: 10.3390/jimaging6050028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/18/2020] [Revised: 04/27/2020] [Accepted: 04/29/2020] [Indexed: 11/17/2022] Open
Abstract
Crowd counting is a challenging task that must deal with variations in object scale and crowd density. Existing works have emphasized skip connections that integrate shallower layers with deeper layers, where each layer extracts features at a different object scale and crowd density. However, only high-level features are emphasized while low-level features are ignored. This paper proposes an estimation network that passes high-level features to shallow layers and emphasizes their low-level features. Since the estimation network is hierarchical, high-level features are in turn emphasized by the improved low-level features. Our estimation network consists of two identical networks for extracting a high-level feature and estimating the final result. To preserve semantic information, dilated convolution is employed without resizing the feature map. Our method was tested on three datasets for counting humans and vehicles in a crowd image. The counting performance is evaluated by mean absolute error and root mean squared error, indicating the accuracy and robustness of the estimation network, respectively. The experimental results show that our network outperforms other related works at high crowd density and is effective at reducing over-counting error in the overall case.
|
86
|
Yang B, Zhan W, Wang N, Liu X, Lv J. Counting crowds using a scale-distribution-aware network and adaptive human-shaped kernel. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.02.071] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/25/2022]
|
87
|
Abstract
We propose a symmetric method of accurately estimating the number of metro passengers from an individual image. To this end, we developed a network for metro-passenger counting called MPCNet, which provides a data-driven and deep learning method of understanding highly congested scenes and accurately estimating crowds, as well as presenting high-quality density maps. The proposed MPCNet is composed of two major components: A deep convolutional neural network (CNN) as the front end, for deep feature extraction; and a multi-column atrous CNN as the back-end, with atrous spatial pyramid pooling (ASPP) to deliver multi-scale reception fields. Existing crowd-counting datasets do not adequately cover all the challenging situations considered in our work. Therefore, we collected specific subway passenger video to compile and label a large new dataset that includes 346 images with 3475 annotated heads. We conducted extensive experiments with this and other datasets to verify the effectiveness of the proposed model. Our results demonstrate the excellent performance of the proposed MPCNet.
|
88
|
Dong Z, Zhang R, Shao X, Li Y. Scale-Recursive Network with point supervision for crowd scene analysis. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.070] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/15/2022]
|
89
|
|
90
|
Ilyas N, Shahzad A, Kim K. Convolutional-Neural Network-Based Image Crowd Counting: Review, Categorization, Analysis, and Performance Evaluation. SENSORS 2019; 20:s20010043. [PMID: 31861734 PMCID: PMC6983207 DOI: 10.3390/s20010043] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 11/23/2019] [Revised: 12/10/2019] [Accepted: 12/13/2019] [Indexed: 12/18/2022]
Abstract
Traditional handcrafted crowd-counting techniques in an image are currently being transformed via machine-learning and artificial-intelligence techniques into intelligent crowd-counting techniques. This paradigm shift offers many advanced features in terms of adaptive monitoring and the control of dynamic crowd gatherings. Adaptive monitoring, identification/recognition, and the management of diverse crowd gatherings can improve many crowd-management-related tasks in terms of efficiency, capacity, reliability, and safety. Despite many challenges, such as occlusion, clutter, irregular object distribution, and nonuniform object scale, convolutional neural networks are a promising technology for intelligent image crowd counting and analysis. In this article, we review, categorize, analyze (limitations and distinctive features), and provide a detailed performance evaluation of the latest convolutional-neural-network-based crowd-counting techniques. We also highlight the potential applications of convolutional-neural-network-based crowd-counting techniques. Finally, we conclude this article by presenting our key observations, providing a strong foundation for future research directions while designing convolutional-neural-network-based crowd-counting techniques. Further, the article discusses new advancements toward understanding crowd counting in smart cities using the Internet of Things (IoT).
Affiliation(s)
- Naveed Ilyas
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
- Ahsan Shahzad
  - Department of Computer and Software Engineering (DCSE), College of Electrical and Mechanical Engineering (EME), National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
- Kiseon Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
|
91
|
Tserpes K, Pateraki M, Varlamis I. Strand: scalable trilateration with Node.js. JOURNAL OF CLOUD COMPUTING: ADVANCES, SYSTEMS AND APPLICATIONS 2019. [DOI: 10.1186/s13677-019-0142-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
This work reports on the development details and results of an experimental setup for the localization of the attendees of a music festival. The application had to report in real time the asymmetric crowd density based on the Received Signal Strength Indicator (RSSI) between the attendees’ smartphones and an experimental installation of 24 WiFi access points. The impermanent nature of the application led to the implementation of a cloud-based solution, called “STRAND”. STRAND is based on Node.js components, which communicate through websockets, collect, process and exchange data and continuously report the produced information to the end-user. To cope with the near real-time requirements, and the volatility of the crowd concentration density, STRAND horizontally scales the trilateration component, i.e. the component that estimates the user location based on distance measurements. STRAND was tested during the festival days in July 2018 and the results show a system that copes with very high loads and achieves the temporal and accuracy requirements that were set.
|
92
|
|
93
|
Zou Z, Cheng Y, Qu X, Ji S, Guo X, Zhou P. Attend to count: Crowd counting with adaptive capacity multi-scale CNNs. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.08.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 10/26/2022]
|
94
|
Vandoni J, Le Hégarat-Mascle S, Aldea E. Augmenting Deep Learning Performance in an Evidential Multiple Classifier System. SENSORS 2019; 19:s19214664. [PMID: 31717870 PMCID: PMC6864766 DOI: 10.3390/s19214664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2019] [Revised: 10/23/2019] [Accepted: 10/24/2019] [Indexed: 11/16/2022]
Abstract
The main objective of this work is to study the applicability of ensemble methods in the context of deep learning with limited amounts of labeled data. We exploit an ensemble of neural networks derived using Monte Carlo dropout, along with an ensemble of SVM classifiers which owes its effectiveness to the hand-crafted features used as inputs and to an active learning procedure. In order to leverage each classifier’s respective strengths, we combine them in an evidential framework, which models specifically their imprecision and uncertainty. The application we consider in order to illustrate the interest of our Multiple Classifier System is pedestrian detection in high-density crowds, which is ideally suited for its difficulty, cost of labeling and intrinsic imprecision of annotation data. We show that the fusion resulting from the effective modeling of uncertainty allows for performance improvement, and at the same time, for a deeper interpretation of the result in terms of commitment of the decision.
Affiliation(s)
- Jennifer Vandoni
  - SATIE-CNRS UMR 8029, Paris-Sud University, Paris-Saclay University, 91405 Orsay CEDEX, France
  - SAFRAN SA, Safran Tech, Pole Technologie du Signal et de l’Information, 78772 Magny-les-Hameaux, France
- Sylvie Le Hégarat-Mascle
  - SATIE-CNRS UMR 8029, Paris-Sud University, Paris-Saclay University, 91405 Orsay CEDEX, France
  - Correspondence: Tel.: +33-169-154-036
- Emanuel Aldea
  - SATIE-CNRS UMR 8029, Paris-Sud University, Paris-Saclay University, 91405 Orsay CEDEX, France
|
95
|
Zhang Y, Zhou C, Chang F, Kot AC. A scale adaptive network for crowd counting. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.07.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|
96
|
Liu L, Amirgholipour S, Jiang J, Jia W, Zeibots M, He X. Performance-enhancing network pruning for crowd counting. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.06.035] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/26/2022]
|
97
|
Liu X, Weijer JVD, Bagdanov AD. Exploiting Unlabeled Data in CNNs by Self-Supervised Learning to Rank. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019; 41:1862-1878. [PMID: 30794168 DOI: 10.1109/tpami.2019.2899857] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 06/09/2023]
Abstract
For many applications the collection of labeled data is expensive and laborious. Exploitation of unlabeled data during training is thus a long-pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of informativeness of unlabeled data. This can be used to drive an algorithm for active learning and we show that this reduces labeling effort by up to 50 percent.
|
98
|
Sindagi VA, Patel VM. HA-CCN: Hierarchical Attention-based Crowd Counting Network. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:323-335. [PMID: 31329118 DOI: 10.1109/tip.2019.2928634] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 06/10/2023]
Abstract
Single image-based crowd counting has recently witnessed increased focus, but many leading methods are far from optimal, especially in highly congested scenes. In this paper, we present Hierarchical Attention-based Crowd Counting Network (HA-CCN) that employs attention mechanisms at various levels to selectively enhance the features of the network. The proposed method, which is based on the VGG16 network, consists of a spatial attention module (SAM) and a set of global attention modules (GAM). SAM enhances low-level features in the network by infusing spatial segmentation information, whereas the GAM focuses on enhancing channel-wise information in the higher level layers. The proposed method is a single-step training framework, simple to implement and achieves state-of-the-art results on different datasets. Furthermore, we extend the proposed counting network by introducing a novel set-up to adapt the network to different scenes and datasets via weak supervision using image-level labels. This new set up reduces the burden of acquiring labour intensive point-wise annotations for new datasets while improving the cross-dataset performance.
|
99
|
Bellocchio E, Ciarfuglia TA, Costante G, Valigi P. Weakly Supervised Fruit Counting for Yield Estimation Using Spatial Consistency. IEEE Robot Autom Lett 2019. [DOI: 10.1109/lra.2019.2903260] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 11/10/2022]
|
100
|
Ma J, Dai Y, Tan YP. Atrous convolutions spatial pyramid network for crowd counting and density estimation. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.03.065] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 11/24/2022]
|