Video Content Classification Research Articles

The explosive growth of online short videos poses great challenges for efficiently managing video content classification, retrieval, and recommendation. Video features for video management can be extracted from image frames by various algorithms, and they have proven effective for video classification in sensor systems. However, frame-by-frame processing requires huge computing power, and classification algorithms based on a single modality of video features cannot meet the accuracy requirements of specific scenarios. In response to these concerns, we introduce a short video classification architecture centered on cross-modal fusion in visual sensor systems, which jointly uses video features and text features to classify short videos and avoids processing a large number of image frames during classification. First, the image space is extended to three-dimensional space-time by a self-attention mechanism, and a series of patches is extracted from a single image frame. Each patch is linearly mapped into the embedding layer of the TimeSformer network and augmented with positional information to extract video features. Second, the text features of subtitles are extracted with the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model. Finally, cross-modal fusion is performed on the extracted video and text features, improving accuracy on short video classification tasks. In our experiments, the proposed classification framework substantially outperforms baseline video classification methods. This framework can potentially be applied to video classification in sensor systems.
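The pipeline described in this abstract (patch extraction, linear embedding with positional information, then fusion of video and text features) can be illustrated with a small sketch. This is not the authors' implementation: the self-attention layers are omitted, the weights are random, mean pooling stands in for the TimeSformer output, a random vector stands in for the BERT subtitle embedding, and fusion by concatenation is an assumption, since the abstract does not say how the features are combined. All function names are hypothetical.

```python
import math
import random


def extract_patches(frame, size):
    """Split a 2-D frame (list of rows) into flattened size x size patches."""
    rows, cols = len(frame), len(frame[0])
    return [
        [frame[r + i][c + j] for i in range(size) for j in range(size)]
        for r in range(0, rows - size + 1, size)
        for c in range(0, cols - size + 1, size)
    ]


def linear(vec, weights, bias):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def embed_patches(patches, weights, bias, positions):
    """Linearly project each patch and add its positional vector."""
    return [
        [e + p for e, p in zip(linear(patch, weights, bias), pos)]
        for patch, pos in zip(patches, positions)
    ]


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(video_feat, text_feat, weights, bias):
    """Assumed fusion: concatenate both feature vectors, then linear + softmax."""
    return softmax(linear(video_feat + text_feat, weights, bias))


def random_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
frame = [[rng.random() for _ in range(8)] for _ in range(8)]  # toy 8x8 frame
patches = extract_patches(frame, 4)  # four 4x4 patches, 16 values each

embed_dim, text_dim, num_classes = 6, 5, 3
tokens = embed_patches(
    patches,
    random_matrix(rng, embed_dim, 16),
    [0.0] * embed_dim,
    random_matrix(rng, len(patches), embed_dim),  # one positional vector per patch
)
video_feat = mean_pool(tokens)  # stand-in for the TimeSformer video feature
text_feat = [rng.random() for _ in range(text_dim)]  # stand-in for a BERT embedding

probs = fuse_and_classify(
    video_feat,
    text_feat,
    random_matrix(rng, num_classes, embed_dim + text_dim),
    [0.0] * num_classes,
)
print(probs)
```

With untrained weights the printed class probabilities are meaningless; the sketch only shows how the shapes fit together, with the classifier input being the video feature length plus the text feature length.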

With the emergence of Web 2.0, sharing personal content, communicating ideas, and interacting with other online users in Web 2.0 communities have become daily routines for online users. User‐generated data from Web 2.0 sites provide rich personal information (e.g., personal preferences and interests) and can be utilized to obtain insight about cyber communities and their social networks. Many studies have focused on leveraging user‐generated information to analyze blogs and forums, but few studies have applied this approach to video‐sharing Web sites. In this study, we propose a text‐based framework for video content classification of online video‐sharing Web sites. Different types of user‐generated data (e.g., titles, descriptions, and comments) were used as proxies for online videos, and three types of text features (lexical, syntactic, and content‐specific features) were extracted. Three feature‐based classification techniques (C4.5, Naïve Bayes, and Support Vector Machine) were used to classify videos. To evaluate the proposed framework, user‐generated data from candidate videos, which were identified by searching user‐given keywords on YouTube, were first collected. Then, a subset of the collected data was randomly selected and manually tagged by users as our experiment data. The experimental results showed that the proposed approach was able to classify online videos based on users' interests with accuracy rates up to 87.2%, and all three types of text features contributed to discriminating videos. Support Vector Machine outperformed C4.5 and Naïve Bayes techniques in our experiments. In addition, our case study further demonstrated that accurate video‐classification results are very useful for identifying implicit cyber communities on video‐sharing Web sites.
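As a rough illustration of the text-as-proxy idea in this abstract, the sketch below classifies videos from their user-generated text with a multinomial Naïve Bayes classifier, one of the three techniques the abstract names. It is a minimal sketch under stated assumptions, not the study's method: only simple lexical (word-count) features are used, the class name, training texts, and labels are invented for illustration, and the SVM that performed best in the study is not shown.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for document, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(document))
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, document):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        # Words never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(document) if t in self.vocabulary]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(doc_count / total_docs)  # log prior
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical proxies: title, description, and comments joined into one string.
train_texts = [
    "guitar lesson chords tutorial",
    "live concert guitar solo",
    "pasta recipe easy dinner",
    "baking bread recipe tutorial",
]
train_labels = ["music", "music", "cooking", "cooking"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(model.predict("easy guitar chords"))       # prints "music"
print(model.predict("bread recipe for dinner"))  # prints "cooking"
```

Syntactic and content-specific features, which the abstract reports also contributed to accuracy, would be added as further feature counts alongside the word counts.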

Articles published on Video Content Classification

A Short Video Classification Framework Based on Cross-Modal Fusion.

Chaser Priori Wolf (CPW) Optimization an Improved Optimization Technique Video Content Classification and Detection

Evaluation of the content of YouTube™ videos about local anesthesia in pediatric dentistry

ENCVIDC: an innovative approach for encoded video content classification

An Automatic Classification Method of Sports Teaching Video Using Support Vector Machine

Research on Image-Based Movement Accuracy Monitoring of Aerobics

Video mining for facial action unit classification using statistical spatial–temporal feature image and LoG deep convolutional neural network

Approaches to Detect Online Radicalization: A Survey

Text-based video content classification for online video-sharing sites

An automatic video content classification scheme based on combined visual features model with modified DAGSVM

Text‐based video content classification for online video‐sharing sites

Video content classification based on 3-D Eigen analysis

Context and memory in multimedia content analysis

MPEG-7 based description schemes for multi-level video content classification

VIDEO CLASSIFICATION USING OBJECT TRACKING

Motion recovery for video content classification
