Abstract

Video has become a dominant medium of communication among internet users, driven by the rapid growth in smartphone use and by increases in internet bandwidth and storage capacity. This has fueled the need for robust video analysis techniques. Video classification in particular is a challenging task with numerous critical applications, such as video indexing, search, annotation, and surveillance. Videos inherently carry both static and dynamic information encoded in their frames. The task is made more urgent by the enormous number of videos now available, which calls for a robust way to organize them. In the literature, researchers have generally adopted three main approaches to video classification: direct feature matching, machine learning-based methods, and deep learning-based methods. Each approach suits a specific type of application. This paper assesses which of the three approaches performs best for video classification. It also examines whether and how these methods affect or improve classification performance, and identifies the key factors in constructing a robust video classification system. The paper addresses a research gap by presenting a rigorous comparative analysis of the three approaches, highlighting their advantages and disadvantages to guide researchers in the field. A comprehensive analysis brings the findings together using a benchmark group of challenging large-scale video datasets (~29k videos). The results should give researchers the information they need to choose the most suitable method for their video classification work.
