Streamer action recognition in live video with spatial-temporal attention and deep dictionary learning

Chenhao Li, Jing Zhang, Jiacheng Yao

doi:10.1016/j.neucom.2020.07.148

Abstract

Live video hosted by streamers is attracting more and more Internet users. A few streamers insert inappropriate actions into otherwise normal live content for profit and popularity, which does great harm to the network environment. To regulate streamer behavior in live video effectively, this paper proposes a streamer action recognition method that combines spatial-temporal attention with deep dictionary learning. First, after video frames are sampled from the live video, a spatial attention network extracts deep features with spatial context, focusing on the action region of the streamer. Then, a temporal attention network learns the attention of each frame within an action and assigns weights that are used to fuse the deep features of the video. Finally, deep dictionary learning sparsely represents the deep features to further recognize streamer actions. Four experiments are conducted on a real-world dataset, and the competitive results demonstrate that our method can improve both the accuracy and the speed of streamer action recognition in live video.
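The temporal fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' network: it assumes each sampled frame already has a deep feature vector, and it scores frames with a single hypothetical vector `query` standing in for the learned temporal attention parameters. The names `softmax`, `fuse_frames`, and `query` are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_frames(frame_features, query):
    """Fuse per-frame feature vectors into one video-level vector.

    frame_features: list of T vectors, each of length D (assumed to come
                    from a spatial feature extractor, not modeled here)
    query:          hypothetical scoring vector of length D that stands in
                    for the learned temporal attention parameters
    Returns (fused_vector, frame_weights).
    """
    # One relevance score per frame: dot product with the scoring vector.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in frame_features]
    # Normalize the scores into frame weights that sum to 1.
    weights = softmax(scores)
    # Weighted sum of frame features, computed per dimension.
    dim = len(frame_features[0])
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, frame_features))
        for d in range(dim)
    ]
    return fused, weights
```

With a zero `query`, every frame receives the same weight and the fused vector reduces to the plain average of the frame features. A trained attention module would instead give larger weights to the frames that are most informative about the action.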

Full Text