Abstract

In recent years, the growth of digital media and modern imaging equipment has expanded the use of video processing algorithms and of semantic film and image management. Video datasets are also increasingly used to train artificial intelligence algorithms in various fields. Because a video carries a high volume of information, processing it remains expensive for most hardware systems, mainly in terms of runtime and memory. Selecting keyframes so as to minimize redundant information has therefore become an important way to ease this problem: eliminating some frames simultaneously reduces the computational load, hardware cost, memory and processing time of intelligent video-based systems. For these reasons, this research proposes a method for selecting keyframes and adaptively cropping input video for human action recognition (HAR) systems. The proposed method combines edge detection, simple difference, adaptive thresholding, and 1D and 2D average filter algorithms in a hierarchical scheme. To assess its efficiency, several HAR methods are trained with videos processed by the proposed method. The results show that the proposed method increases the accuracy of the HAR system by up to 3% compared to random image selection and cropping methods. Additionally, in most cases, it reduces the training time of the machine learning algorithm used.
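As a rough illustration of how simple difference, a 1D average filter and adaptive thresholding can be chained for keyframe selection, the sketch below scores each frame by its mean absolute difference from the previous one, smooths the scores, and keeps frames above a data-dependent threshold. It is not the authors' implementation: the function names, the window size, the `mean + k * std` threshold rule and the representation of frames as nested lists of grayscale values are all assumptions made for the example, and the edge-detection stage of the proposed method is omitted.

```python
from statistics import mean, pstdev


def frame_difference_scores(frames):
    """Mean absolute pixel difference between consecutive grayscale frames."""
    scores = [0.0]  # the first frame has no predecessor
    for prev, curr in zip(frames, frames[1:]):
        total, count = 0, 0
        for row_prev, row_curr in zip(prev, curr):
            for p, c in zip(row_prev, row_curr):
                total += abs(c - p)
                count += 1
        scores.append(total / count)
    return scores


def moving_average(signal, window=3):
    """1D average filter; the window is truncated at both ends of the signal."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(mean(signal[lo:hi]))
    return smoothed


def select_keyframes(frames, window=3, k=0.5):
    """Indices of frames whose smoothed score exceeds mean + k * std.

    The threshold rule is an assumption for this sketch; it adapts to each
    video because it is derived from that video's own score statistics.
    """
    scores = moving_average(frame_difference_scores(frames), window)
    threshold = mean(scores) + k * pstdev(scores)
    return [i for i, s in enumerate(scores) if s > threshold]
```

In practice the frames would be array data handled by an image library rather than nested lists; plain lists are used here only to keep the sketch free of dependencies.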

Highlights

  • The use of video and digital content has expanded due to smartphones and other available imaging equipment

  • The key goal of the proposed method is to pre-process input videos so that unnecessary information is removed at the start of the human action recognition (HAR) pipeline; the work therefore focuses mainly on keyframe selection and region of interest (ROI) finding at the input block of HAR systems

  • This research proposed a method for selecting the keyframes and suitable regions in a video to increase the speed and accuracy of HAR systems

Summary

Introduction

The use of video and digital content has expanded due to smartphones and other available imaging equipment. In applications such as selecting keyframes to create a short film as a movie trailer, the goal is to produce an appealing trailer that prompts people to spend money, go to the cinema and watch the entire movie. In applications related to machine vision systems, which use various videos for training, selecting a fixed length of video is necessary to create a better training sample and to reduce the required training time based on the relevant details of the system input [2]. Common video-based HAR systems use methods such as resizing or cropping frames to match the different resolutions of acquisition cameras, which may reduce system efficiency.
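To make the contrast with fixed resizing or cropping concrete, the following sketch crops a clip to the region where motion occurs. It accumulates per-pixel frame differences, smooths the result with a 2D average filter, thresholds it at the mean of the smoothed map, and returns the bounding box of the remaining pixels. This is an illustrative assumption about how such a step could work, not the procedure from the paper: the function names, the filter radius, the mean-based threshold and the nested-list frame format are all hypothetical.

```python
def motion_map(frames):
    """Per-pixel sum of absolute differences between consecutive frames."""
    h, w = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * w for _ in range(h)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(h):
            for x in range(w):
                acc[y][x] += abs(curr[y][x] - prev[y][x])
    return acc


def box_filter(img, radius=1):
    """2D average filter; the window is truncated at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def motion_bounding_box(frames, radius=1):
    """Return (top, bottom, left, right) with exclusive bottom/right,
    or None when the clip contains no motion."""
    smooth = box_filter(motion_map(frames), radius)
    flat = [v for row in smooth for v in row]
    threshold = sum(flat) / len(flat)  # assumed rule: mean of the smoothed map
    ys = [y for y, row in enumerate(smooth) if any(v > threshold for v in row)]
    xs = [x for x in range(len(smooth[0]))
          if any(row[x] > threshold for row in smooth)]
    if not ys:
        return None
    return ys[0], ys[-1] + 1, xs[0], xs[-1] + 1


def crop(frames, box):
    """Apply the same bounding box to every frame of the clip."""
    top, bottom, left, right = box
    return [[row[left:right] for row in frame[top:bottom]] for frame in frames]
```

Because the box is derived from each clip's own motion statistics, the crop follows the actor rather than a fixed image region, which is the general idea behind adaptive cropping as opposed to a uniform resize.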

Literature Review
Video Shortening
Adaptive Frame Cropping
Result and Discussion
Method
Findings
Conclusions