Abstract

Segment-level video semantic recognition, which known to be an important task in video analysis, attempts to identify the semantic concepts in short video clips.Labeling video segments is difficult because there is an extremely large number of segments and there are no network tags; consequently, only a portion of the video segments are labeled. Determining how to improve the accuracy of semantic recognition of fragmented videos with limited semantic labels is a key challenge in video semantic recognition.This paper proposes a video semantic recognition algorithm based on the consistency of video- and segment-level predictions. The proposed algorithm introduces the constraint of consistency between complete video semantics and fragmentary video semantics. The proposed algorithm can be applied to filter the video segment semantic results to improve recognition accuracy. The proposed algorithm achieved 82.62% mean average precision on the video segment semantic recognition task using the large-scale video dataset YouTube-8M and ranked second in the third YouTube-8M competition.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.