Three-stream fusion network for first-person interaction recognition

Ye-Ji Kim, Dong-Gyu Lee, Seong-Whan Lee

doi:10.1016/j.patcog.2020.107279

Abstract

First-person interaction recognition is a challenging task because of unstable video conditions resulting from the camera wearer’s movement. For human interaction recognition from a first-person viewpoint, this paper proposes a three-stream fusion network with two main parts: a three-stream architecture and a three-stream correlation fusion. The three-stream architecture captures the characteristics of the target appearance, target motion, and camera ego-motion. Meanwhile, the three-stream correlation fusion combines the feature maps of the three streams to account for the correlations among the target appearance, target motion, and camera ego-motion. The fused feature vector is robust to camera movement and compensates for the noise of the camera ego-motion. Short-term intervals are modeled using the fused feature vector, and a long short-term memory (LSTM) model considers the temporal dynamics of the video. We evaluated the proposed method on two public benchmark datasets to validate the effectiveness of our approach. The experimental results show that the proposed fusion method successfully generated a discriminative feature vector, and our network outperformed all competing activity recognition methods in first-person videos where considerable camera ego-motion occurs.
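The pipeline described in the abstract (three per-stream features, a fusion step, then an LSTM over short-term intervals) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the correlation fusion operator, so the sketch assumes plain feature vectors instead of feature maps and uses pairwise element-wise products as a stand-in for correlation. The names `fuse_streams`, `lstm_step`, `init_params`, and `encode_video` are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def fuse_streams(appearance, motion, ego_motion):
    """Assumed fusion: the three stream vectors plus their pairwise products.

    The element-wise products are a simple stand-in for cross-stream
    correlation; the paper's actual operator may differ.
    """
    fused = list(appearance) + list(motion) + list(ego_motion)
    pairs = [(appearance, motion), (appearance, ego_motion), (motion, ego_motion)]
    for a, b in pairs:
        fused.extend(x * y for x, y in zip(a, b))
    return fused


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, W, U, b):
    """One standard LSTM step; W, U, b are dicts keyed by gate name."""
    def affine(gate):
        return [
            sum(w * xi for w, xi in zip(W[gate][k], x))
            + sum(u * hi for u, hi in zip(U[gate][k], h))
            + b[gate][k]
            for k in range(len(h))
        ]

    i = [sigmoid(v) for v in affine("i")]       # input gate
    f = [sigmoid(v) for v in affine("f")]       # forget gate
    o = [sigmoid(v) for v in affine("o")]       # output gate
    cand = [math.tanh(v) for v in affine("g")]  # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def init_params(input_dim, hidden_dim, seed=0):
    """Random (untrained) LSTM parameters, for illustration only."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    W = {g: mat(hidden_dim, input_dim) for g in "ifog"}
    U = {g: mat(hidden_dim, hidden_dim) for g in "ifog"}
    b = {g: [0.0] * hidden_dim for g in "ifog"}
    return W, U, b


def encode_video(segments, hidden_dim=4, seed=0):
    """Fuse each short-term interval, then run the LSTM over the sequence.

    `segments` is a list of (appearance, motion, ego_motion) vector triples,
    one triple per short-term interval.
    """
    fused_seq = [fuse_streams(*segment) for segment in segments]
    W, U, b = init_params(len(fused_seq[0]), hidden_dim, seed)
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in fused_seq:
        h, c = lstm_step(x, h, c, W, U, b)
    return h  # final hidden state, which a classifier could consume


if __name__ == "__main__":
    video = [([0.1, 0.2], [0.3, 0.4], [0.5, 0.6]) for _ in range(3)]
    print(encode_video(video))
```

In a real system, the per-stream vectors would come from convolutional networks applied to RGB frames and optical flow, and the final hidden state would feed a classification layer; both are omitted here.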

Journal: Pattern Recognition
Publication Date: Feb 18, 2020
Citations: 5
