Abstract
This paper proposes a faster and more accurate network for human spatiotemporal action localization. Like the YOWO model, we use convolutional neural networks (CNNs) for feature extraction, but our model differs from YOWO in three significant ways. First, we drop the feature fusion strategy: spatial features extracted by 2D CNNs are used for action localization, and spatiotemporal features extracted by 3D CNNs are used for action recognition. Second, we improve the 2D CNN by introducing a coordinate attention mechanism, and we replace the coordinate offset loss with the CIoU loss for bounding box regression. Third, we present a more lightweight and faster spatiotemporal action localization architecture, which uses 21.76 million fewer parameters than YOWO and achieves a speed of 39 fps on 16-frame input clips. We evaluate our model on three public datasets: UCF-Sports, JHMDB-21 and UCF101-24. Compared with the YOWO model, we improve frame-mAP (@IoU 0.5) by 17.09% and 7.15% on the UCF-Sports and JHMDB-21 datasets, respectively; for video-mAP on JHMDB-21, we improve by 2.7%, 8.7% and 14.4% at IoU thresholds of 0.2, 0.5 and 0.75.
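For readers unfamiliar with the regression loss mentioned above, the following is a minimal sketch of the standard CIoU loss (Zheng et al., 2020), which penalizes the IoU gap, the normalized center distance and the aspect-ratio inconsistency between the predicted and ground-truth boxes. The (cx, cy, w, h) box format, the function name `ciou_loss` and the epsilon constants are illustrative assumptions; the paper's exact implementation may differ.

```python
import math

def ciou_loss(pred, target, eps=1e-9):
    """Complete IoU loss for two axis-aligned boxes given as (cx, cy, w, h)."""
    px, py, pw, ph = pred
    tx, ty, tw, th = target

    # Convert center format to corner coordinates.
    p_x1, p_y1, p_x2, p_y2 = px - pw / 2, py - ph / 2, px + pw / 2, py + ph / 2
    t_x1, t_y1, t_x2, t_y2 = tx - tw / 2, ty - th / 2, tx + tw / 2, ty + th / 2

    # Intersection over union.
    iw = max(0.0, min(p_x2, t_x2) - max(p_x1, t_x1))
    ih = max(0.0, min(p_y2, t_y2) - max(p_y1, t_y1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the smallest enclosing box.
    rho2 = (px - tx) ** 2 + (py - ty) ** 2
    cw = max(p_x2, t_x2) - min(p_x1, t_x1)
    ch = max(p_y2, t_y2) - min(p_y1, t_y1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(tw / th) - math.atan(pw / ph)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain coordinate offset loss, all three terms here are scale-invariant, which is one common motivation for swapping CIoU in for offset regression.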