LCMA-Net: A light cross-modal attention network for streamer re-identification in live video

Jiacheng Yao, Jing Zhang, Hui Zhang, Li Zhuo

doi:10.1016/j.cviu.2024.104183

Abstract

With the rapid expansion of the we-media industry, streamers have increasingly incorporated inappropriate content into live videos to attract traffic and pursue financial interests. Blacklisted streamers often forge their identities or switch platforms to continue streaming, causing significant harm to the online environment. Consequently, streamer re-identification (re-ID) has become critically important. Streamer biometrics in live videos are multimodal: voiceprints, faces, and spatiotemporal information complement one another. We therefore propose a light cross-modal attention network (LCMA-Net) for streamer re-ID in live videos. First, the voiceprint, face, and spatiotemporal features of the streamer are extracted by RawNet-SA, Π-Net, and STDA-ResNeXt3D, respectively. We then design a light cross-modal pooling attention (LCMPA) module that, combined with a multilayer perceptron (MLP), aligns and concatenates the features of the different modalities into multimodal features within LCMA-Net. Finally, the streamer is re-identified by measuring the similarity between these multimodal features. Five experiments conducted on the StreamerReID dataset show that the proposed method achieves competitive performance. The dataset and code are available at https://github.com/BJUT-AIVBD/LCMA-Net.
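The final matching step described in the abstract (comparing fused multimodal features by similarity) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the internals of LCMPA or the MLP, so the `fuse` function below is a hypothetical stand-in that simply L2-normalizes and concatenates the three modality vectors, and cosine similarity is an assumed choice of similarity measure. The toy feature values and identity names are invented for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(voice, face, spatiotemporal):
    """Hypothetical stand-in for fusion: normalize each modality, concatenate.

    This is plain concatenation, NOT the paper's LCMPA + MLP alignment.
    """
    fused = []
    for feat in (voice, face, spatiotemporal):
        fused.extend(l2_normalize(feat))
    return l2_normalize(fused)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def rank_gallery(query, gallery):
    """Return (identity, score) pairs sorted by descending similarity."""
    scores = [(name, cosine_similarity(query, feat)) for name, feat in gallery.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy gallery of known streamers (invented 2-D features per modality).
gallery = {
    "streamer_a": fuse([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    "streamer_b": fuse([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),
}

# Query clip whose features resemble streamer_a in all three modalities.
query = fuse([0.9, 0.1], [1.0, 0.0], [0.1, 0.9])
ranking = rank_gallery(query, gallery)
print(ranking[0][0])
```

In a real system the three input vectors would come from the modality-specific extractors named in the abstract, and the fusion step would be replaced by the learned module from the linked repository.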


Published in: Computer Vision and Image Understanding


Similar Papers

CCGL-YOLOV5: A cross-modal cross-scale global-local attention YOLOV5 lung tumor detection model
Tao Zhou ... Huiling Lu
Computers in Biology and Medicine | VOL. 165
28 Aug 2023

Multilevel fusion of multimodal deep features for porn streamer recognition in live video
Liyuan Wang ... Li Zhuo
Pattern Recognition Letters | VOL. 140
25 Sep 2020

Porn Streamer Recognition in Live Video Streaming via Attention-Gated Multimodal Deep Features
Liyuan Wang ... Li Zhuo
IEEE Transactions on Circuits and Systems for Video Technology | VOL. 30
26 Dec 2019

Emotional analysis of joint sports quality expansion tasks based on multi-modal feature fusion
Huijing Li ... Hong Sun
Systems and Soft Computing | VOL. 6
02 Apr 2024
