In this study, we focused on formalizing video frame descriptions in the context of the video segmentation problem. Because native video data can have various sizes, dividing each frame into blocks allows the frame to be represented as a square matrix for formal description. Each frame block is a matrix of arbitrary dimensions. Skipping the step of transforming the matrix to a square one, or of vectorizing it with some descriptor, reduces computational cost and frees the resources that this transformation would require. We used the value of the Ky Fan norm as the descriptor of an image frame block. The Ky Fan norm is built on the singular values of a matrix, and the singular value decomposition imposes no restrictions on either the dimensions or the nature of the elements of the original matrix. We conducted a comparative analysis of the effectiveness of the resulting descriptor for video data of different sizes and aspect ratios, showing that the change in the descriptor for each block is independent of the video size and aspect ratio: frame-to-frame changes in the block descriptors are identical for video data of varying sizes. As a result of this block-wise transform, a square matrix of fixed size is created regardless of the video size. This makes it possible to unify further processing of the video, which can be useful for information search in large video databases when the query is provided "ad exemplum". In this case, the existing database can be analyzed offline and each video matched with a fixed-size square matrix of descriptors, which significantly reduces the time and resources needed when matching against the query. The approach can also be used effectively to analyze video data for motion detection and scene change tracking.
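The block-wise descriptor described above can be illustrated with a minimal sketch. It relies on the standard definition of the Ky Fan k-norm as the sum of the k largest singular values of a matrix. The function names `ky_fan_norm` and `frame_descriptor`, the grid size of 8, and the default `k=1` are assumptions made for illustration only; the abstract does not state the values or the partitioning scheme the authors actually used.

```python
import numpy as np


def ky_fan_norm(block, k=1):
    """Ky Fan k-norm: sum of the k largest singular values of a matrix.

    The block may be rectangular; the SVD does not require a square input.
    """
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(np.asarray(block, dtype=float),
                                    compute_uv=False)
    return float(singular_values[:k].sum())


def frame_descriptor(frame, grid=8, k=1):
    """Map a grayscale frame of any size to a grid x grid descriptor matrix.

    `grid` and `k` are hypothetical defaults chosen for this sketch.
    """
    frame = np.asarray(frame, dtype=float)
    if frame.ndim != 2:
        raise ValueError("expected a 2-D (grayscale) frame")
    if min(frame.shape) < grid:
        raise ValueError("frame is smaller than the requested grid")

    descriptor = np.empty((grid, grid))
    # array_split tolerates frame sizes that are not multiples of `grid`,
    # so blocks may differ in size by one row or column.
    for i, band in enumerate(np.array_split(frame, grid, axis=0)):
        for j, block in enumerate(np.array_split(band, grid, axis=1)):
            descriptor[i, j] = ky_fan_norm(block, k)
    return descriptor
```

With these assumptions, frames of 480×640 and 1080×1920 pixels both yield an 8×8 descriptor matrix, and a frame-to-frame change can be examined as the element-wise difference between the descriptors of consecutive frames. Whether raw norm values should be normalized by block size before comparing videos of different resolutions is not specified in the abstract and is left open here.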