Abstract
Block-based video modeling is an active topic in video information processing. In previous work, the block size is set to a fixed value. In contrast, we find that the optimal primitive size depends on the video content rather than on a fixed value. In this paper, to model natural video sequences, we segment a video sequence into a number of spatio-temporal neighborhoods and categorize them into two types: structural video primitives and textural video primitives. Structural video primitives represent structural pixels and their motion, while textural video primitives represent textured neighborhoods and their motion. We learn the size of the video primitives using a genetic algorithm together with spatio-temporal neighborhood entropy, and then map spatio-temporal neighborhoods to primitives using the structural similarity (SSIM) index. The experimental results demonstrate that the optimal primitive size depends on the video content rather than on a fixed value. With our method, structural and textural video primitives are separated better than with a fixed size, and the computational time for learning primitives is greatly reduced. The learned primitives can be applied to video reconstruction, video segmentation, and other applications.
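The two quantities named above, spatio-temporal neighborhood entropy and the SSIM index, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, the entropy is a plain Shannon entropy over an intensity histogram of a (T, H, W) block, and the SSIM here is the simplified single-window form rather than the windowed version commonly used in practice.

```python
import numpy as np

def block_entropy(block, bins=16):
    """Shannon entropy (bits) of pixel intensities in a spatio-temporal
    block of shape (T, H, W) with values in [0, 255]. Low entropy suggests
    a flat/structural region; high entropy suggests texture."""
    hist, _ = np.histogram(block, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-np.sum(p * np.log2(p)))

def global_ssim(x, y, c1=6.5025, c2=58.5225):
    """Simplified single-window SSIM between two equally sized blocks.
    c1, c2 are the usual stabilizing constants for 8-bit data
    ((0.01*255)**2 and (0.03*255)**2)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + c1) * (2 * cov + c2)) /
                 ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))
```

In a scheme like the one the abstract describes, entropy would score each candidate block size inside the genetic-algorithm fitness, and SSIM would measure how well a neighborhood matches a learned primitive.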