1. Chen J, Yang G, Zhang Z, Wang W. ST-D3DDARN: Urban traffic flow prediction based on spatio-temporal decoupled 3D DenseNet with attention ResNet. PLoS One 2024;19:e0305424. [PMID: 38865366] [PMCID: PMC11168702] [DOI: 10.1371/journal.pone.0305424] [Received: 01/26/2024] [Accepted: 05/29/2024]
Abstract
Urban traffic flow prediction plays a crucial role in intelligent transportation systems (ITS), which can enhance traffic efficiency and ensure public safety. However, predicting urban traffic flow faces numerous challenges, such as intricate temporal dependencies, spatial correlations, and the influence of external factors. Existing research methods cannot fully capture the complex spatio-temporal dependence of traffic flow. Inspired by video analysis in computer vision, we represent traffic flow as traffic frames and propose an end-to-end urban traffic flow prediction model named Spatio-temporal Decoupled 3D DenseNet with Attention ResNet (ST-D3DDARN). Specifically, this model extracts multi-source traffic flow features through closeness, period, trend, and external factor branches. Subsequently, it dynamically establishes global spatio-temporal correlations by integrating spatial self-attention and coordinate attention in a residual network, accurately predicting the inflow and outflow of traffic throughout the city. In order to evaluate the effectiveness of the ST-D3DDARN model, experiments are carried out on two publicly available real-world datasets. The results indicate that ST-D3DDARN outperforms existing models in terms of single-step prediction, multi-step prediction, and efficiency.
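The closeness, period, and trend branches described above sample past traffic frames at different temporal strides. A minimal sketch of such a sampling scheme, assuming hourly frames with a daily period and a weekly trend (the offsets and branch depths here are illustrative, not the paper's exact configuration):

```python
def sample_branch_indices(t, n_frames, period, trend):
    """Return past frame indices for the closeness, period, and trend
    branches at time step t (all offsets are illustrative assumptions)."""
    closeness = [t - i for i in range(1, n_frames + 1)]           # most recent frames
    periodic = [t - i * period for i in range(1, n_frames + 1)]   # e.g. daily cycle
    trendy = [t - i * trend for i in range(1, n_frames + 1)]      # e.g. weekly cycle
    return closeness, periodic, trendy

# Example: hourly frames, daily period (24 h), weekly trend (168 h)
c, p, q = sample_branch_indices(t=1000, n_frames=3, period=24, trend=168)
```

Each branch's frames would then be fed to its own feature extractor before fusion with the external-factor branch.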
Affiliation(s)
- Jing Chen
- College of Information Technology and Engineering, Tianjin University of Technology and Education, Tianjin, China
- Guowei Yang
- College of Information Technology and Engineering, Tianjin University of Technology and Education, Tianjin, China
- Zhaochong Zhang
- College of Information Technology and Engineering, Tianjin University of Technology and Education, Tianjin, China
- Wei Wang
- College of Information Technology and Engineering, Tianjin University of Technology and Education, Tianjin, China
2. Papadakis A, Spyrou E. A Multi-Modal Egocentric Activity Recognition Approach towards Video Domain Generalization. Sensors (Basel) 2024;24:2491. [PMID: 38676108] [PMCID: PMC11054491] [DOI: 10.3390/s24082491] [Received: 02/22/2024] [Revised: 04/08/2024] [Accepted: 04/10/2024]
Abstract
Egocentric activity recognition is a prominent computer vision task based on the use of wearable cameras. Since egocentric videos are captured from the perspective of the person wearing the camera, the wearer's body motion strongly affects the video content, imposing several challenges. In this work, we propose a novel approach for domain-generalized egocentric human activity recognition. Typical approaches use a large amount of training data, aiming to cover all possible variants of each action. Moreover, several recent approaches have attempted to handle discrepancies between domains with a variety of costly and mostly unsupervised domain adaptation methods. We show that, through simple manipulation of the available source-domain data and with minor involvement of the target domain, we can produce robust models able to adequately predict human activity in egocentric video sequences. To this end, we introduce a novel three-stream deep neural network architecture that combines elements of vision transformers and residual neural networks and is trained on multi-modal data. We evaluate the proposed approach on a challenging egocentric video dataset and demonstrate its superiority over recent state-of-the-art works.
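A common way to combine the per-stream predictions of a multi-stream network like the one described above is late fusion of class probabilities. A minimal sketch, assuming softmax averaging with equal stream weights (the paper's actual fusion scheme may differ):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_streams(stream_logits, weights=None):
    """Late fusion: average the per-stream class probabilities and
    return the argmax class. Equal weights by default (an assumption)."""
    if weights is None:
        weights = [1.0 / len(stream_logits)] * len(stream_logits)
    probs = [softmax(logits) for logits in stream_logits]
    n_classes = len(probs[0])
    fused = [sum(w * p[c] for w, p in zip(weights, probs))
             for c in range(n_classes)]
    return fused.index(max(fused))

# Three streams voting over three activity classes
pred = fuse_streams([[0.1, 2.0, 0.3], [1.5, 0.2, 0.1], [0.0, 1.8, 0.2]])
```

Here two of the three streams favor class 1, so the fused prediction follows the majority even though one stream disagrees.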
Affiliation(s)
- Antonios Papadakis
- Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15772 Athens, Greece
- Evaggelos Spyrou
- Department of Informatics and Telecommunications, University of Thessaly, 35100 Lamia, Greece
3. Yin X, Zhong J, Lian D, Cao W. An adaptively multi-correlations aggregation network for skeleton-based motion recognition. Sci Rep 2023;13:19138. [PMID: 37932348] [PMCID: PMC10628167] [DOI: 10.1038/s41598-023-46155-3] [Received: 05/15/2023] [Accepted: 10/28/2023]
Abstract
Previous work based on Graph Convolutional Networks (GCNs) has shown promising performance in 3D skeleton-based motion recognition. We argue that the 3D skeleton-based motion recognition problem can be framed as a modeling task over dynamically constructed skeleton graphs. However, existing methods fail to model human poses with dynamic correlations between human joints, ignoring the information carried by non-connected joint pairs in the skeleton structure during motion modeling. In this paper, we propose an Adaptively Multi-correlations Aggregation Network (AMANet) to capture the dynamic joint dependencies embedded in skeleton graphs, which includes three key modules: the Spatial Feature Extraction Module (SFEM), the Temporal Feature Extraction Module (TFEM), and the Spatio-Temporal Feature Extraction Module (STFEM). In addition, we derive the relative coordinates of the joints of various body parts via the moving frames of differential geometry. On this basis, we design a Data Preprocessing Module (DP) that enriches the features of the original skeleton data. Extensive experiments on three public datasets (NTU-RGB+D 60, NTU-RGB+D 120, and Kinetics-Skeleton 400) demonstrate the effectiveness of the proposed method.
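The GCN building block underlying this line of work aggregates each joint's features from its skeleton neighbors. A minimal sketch of one such layer, H = ReLU(D⁻¹(A + I) X W), on a toy 3-joint chain (the adjacency normalization and tiny shapes are illustrative, not AMANet's actual design):

```python
def gcn_layer(adj, feats, weight):
    """One graph-convolution step over a skeleton graph:
    H = ReLU(D^-1 (A + I) X W).
    Shapes: adj is N x N, feats is N x F, weight is F x G."""
    n = len(adj)
    # add self-loops so each joint keeps its own features
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    # row-normalize by degree
    a = [[v / sum(row) for v in row] for row in a]
    # aggregate neighbor features
    agg = [[sum(a[i][k] * feats[k][f] for k in range(n))
            for f in range(len(feats[0]))] for i in range(n)]
    # linear transform + ReLU
    return [[max(0.0, sum(agg[i][f] * weight[f][g] for f in range(len(weight))))
             for g in range(len(weight[0]))] for i in range(n)]

# 3 joints in a chain: 0 - 1 - 2, identity weight for clarity
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = gcn_layer(adj, feats, [[1.0, 0.0], [0.0, 1.0]])
```

A learned, data-dependent adjacency (rather than the fixed one above) is what lets methods like AMANet capture correlations between joints that are not physically connected.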
Affiliation(s)
- Xinpeng Yin
- Guangdong Multimedia Information Service Engineering Technology Research Center, Shenzhen University, Yuehai Street, Shenzhen, 518060, China
- Jianqi Zhong
- Guangdong Multimedia Information Service Engineering Technology Research Center, Shenzhen University, Yuehai Street, Shenzhen, 518060, China
- Deliang Lian
- Guangdong Multimedia Information Service Engineering Technology Research Center, Shenzhen University, Yuehai Street, Shenzhen, 518060, China
- Wenming Cao
- Guangdong Multimedia Information Service Engineering Technology Research Center, Shenzhen University, Yuehai Street, Shenzhen, 518060, China
- State Key Laboratory of Radio Frequency Heterogeneous Integration, Shenzhen University, Yuehai Street, Shenzhen, 518060, China
4. Diraco G, Rescio G, Siciliano P, Leone A. Review on Human Action Recognition in Smart Living: Sensing Technology, Multimodality, Real-Time Processing, Interoperability, and Resource-Constrained Processing. Sensors (Basel) 2023;23:5281. [PMID: 37300008] [DOI: 10.3390/s23115281] [Received: 05/02/2023] [Revised: 05/23/2023] [Accepted: 05/30/2023]
Abstract
Smart living, a concept that has gained increasing attention in recent years, revolves around integrating advanced technologies in homes and cities to enhance the quality of life for citizens. Sensing and human action recognition are crucial aspects of this concept. Smart living applications span various domains, such as energy consumption, healthcare, transportation, and education, which greatly benefit from effective human action recognition. This field, originating from computer vision, seeks to recognize human actions and activities using not only visual data but also many other sensor modalities. This paper comprehensively reviews the literature on human action recognition in smart living environments, synthesizing the main contributions, challenges, and future research directions. This review selects five key domains, i.e., Sensing Technology, Multimodality, Real-time Processing, Interoperability, and Resource-Constrained Processing, as they encompass the critical aspects required for successfully deploying human action recognition in smart living. These domains highlight the essential role that sensing and human action recognition play in successfully developing and implementing smart living solutions. This paper serves as a valuable resource for researchers and practitioners seeking to further explore and advance the field of human action recognition in smart living.
Affiliation(s)
- Giovanni Diraco
- National Research Council of Italy, Institute for Microelectronics and Microsystems, 73100 Lecce, Italy
- Gabriele Rescio
- National Research Council of Italy, Institute for Microelectronics and Microsystems, 73100 Lecce, Italy
- Pietro Siciliano
- National Research Council of Italy, Institute for Microelectronics and Microsystems, 73100 Lecce, Italy
- Alessandro Leone
- National Research Council of Italy, Institute for Microelectronics and Microsystems, 73100 Lecce, Italy
5. Li S, Liu Y. Human motion recognition based on Nano-CMOS image sensor. Math Biosci Eng 2023;20:10135-10152. [PMID: 37322926] [DOI: 10.3934/mbe.2023444]
Abstract
Human motion recognition is of great value in intelligent monitoring systems, driver assistance systems, advanced human-computer interaction, human motion analysis, and image and video processing. However, current human motion recognition methods suffer from poor recognition performance. We therefore propose a human motion recognition method based on a Nano complementary metal oxide semiconductor (CMOS) image sensor. First, the Nano-CMOS image sensor is used to transform and process the human motion image; combined with a pixel-level background mixture model, human motion features are extracted and feature selection is performed. Second, exploiting the three-dimensional scanning capability of the Nano-CMOS image sensor, human joint coordinate data are collected, the state variables of human motion are sensed, and a human motion model is constructed from the measurement matrix of human motions. Finally, the foreground features of human motion images are obtained by computing the feature parameters of each motion gesture, and the recognition objective function is derived from the posterior conditional probability of the human motion images to realize recognition. The results show that the proposed method performs well: extraction accuracy is high, the average recognition rate is 92%, classification accuracy is high, and the recognition speed reaches 186 frames/s.
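The pixel-level background model used above for foreground extraction can be illustrated with a simple running-average scheme; this is a common stand-in, not the paper's exact mixture model, and the learning rate and threshold below are illustrative assumptions:

```python
def update_background(bg, frame, alpha=0.05):
    """Running-average background model: blend each background pixel
    toward the new frame with learning rate alpha (an assumption)."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=25):
    """Mark pixels whose deviation from the background exceeds thresh
    as foreground (1); everything else is background (0)."""
    return [1 if abs(f - b) > thresh else 0 for b, f in zip(bg, frame)]

# A flat background and one moving-object pixel
bg = [100.0, 100.0, 100.0, 100.0]
frame = [100.0, 200.0, 100.0, 100.0]
mask = foreground_mask(bg, frame)      # only the changed pixel is flagged
bg = update_background(bg, frame)      # background slowly absorbs the change
```

A mixture model generalizes this by keeping several Gaussian components per pixel, which tolerates repetitive background motion that a single running average cannot.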
Affiliation(s)
- Shangbin Li
- Physical Education Department, Harbin Engineering University, Harbin 150001, China
- Yu Liu
- Physical Education Department, Harbin Engineering University, Harbin 150001, China
6. ST-3DGMR: Spatio-temporal 3D Grouped Multiscale ResNet Network for Region-based Urban Traffic Flow Prediction. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.12.066]
7. Momin MS, Sufian A, Barman D, Dutta P, Dong M, Leo M. In-Home Older Adults' Activity Pattern Monitoring Using Depth Sensors: A Review. Sensors (Basel) 2022;22:9067. [PMID: 36501769] [PMCID: PMC9735577] [DOI: 10.3390/s22239067] [Received: 09/12/2022] [Revised: 11/10/2022] [Accepted: 11/15/2022]
Abstract
The global population is aging due to many factors, including longer life expectancy through better healthcare, changing diets, physical activity, etc. We are also witnessing frequent epidemics as well as pandemics. The existing healthcare system has failed to deliver the care and support our older adults (seniors) need during these frequent outbreaks. Sophisticated sensor-based in-home care systems may offer an effective solution to this global crisis, and the monitoring system is the key component of any such system. The evidence indicates that monitoring systems are more useful when implemented non-intrusively through different visual and audio sensors. Artificial Intelligence (AI) and Computer Vision (CV) techniques may be ideal for this purpose. Since RGB imagery-based CV techniques may compromise privacy, people often hesitate to adopt in-home care systems that use them; depth-, thermal-, and audio-based CV techniques can be meaningful substitutes. Due to the need to monitor larger areas, this review article presents a systematic discussion of the state of the art that uses depth sensors as the primary data-capturing technique, focusing mainly on fall detection and other health-related physical patterns. As gait parameters may help to detect these activities, depth sensor-based gait parameters are also considered separately. The article discusses the relevant terminology, reviews prior work, surveys popular datasets, and outlines future scopes.
Affiliation(s)
- Md Sarfaraz Momin
- Department of Computer Science, Kaliachak College, University of Gour Banga, Malda 732101, India
- Department of Computer & System Sciences, Visva-Bharati University, Bolpur 731235, India
- Abu Sufian
- Department of Computer Science, University of Gour Banga, Malda 732101, India
- Debaditya Barman
- Department of Computer & System Sciences, Visva-Bharati University, Bolpur 731235, India
- Paramartha Dutta
- Department of Computer & System Sciences, Visva-Bharati University, Bolpur 731235, India
- Mianxiong Dong
- Department of Science and Informatics, Muroran Institute of Technology, Muroran 050-8585, Hokkaido, Japan
- Marco Leo
- National Research Council of Italy, Institute of Applied Sciences and Intelligent Systems, 73100 Lecce, Italy
8. Yadav SK, Agarwal A, Kumar A, Tiwari K, Pandey HM, Akbar SA. YogNet: A two-stream network for realtime multiperson yoga action recognition and posture correction. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109097]
9. Liu Z, Cheng J, Liu L, Ren Z, Zhang Q, Song C. Dual-stream cross-modality fusion transformer for RGB-D action recognition. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109741]
10. Deep spatio-temporal 3D DenseNet with multiscale ConvLSTM-ResNet network for citywide traffic flow forecasting. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109054]
11. Liu Y, Zhang H, Xu D, He K. Graph transformer network with temporal kernel attention for skeleton-based action recognition. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108146]
12. Ma C, Lin C, Samuel OW, Guo W, Zhang H, Greenwald S, Xu L, Li G. A Bi-Directional LSTM Network for Estimating Continuous Upper Limb Movement From Surface Electromyography. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3097272]