Abstract
The layered dynamic texture (LDT) is a generative model that represents video as a collection of stochastic layers of different appearance and dynamics. Each layer is modeled as a temporal texture sampled from a different linear dynamical system, with regions of the video assigned to layers using a Markov random field. Model parameters are learned from training video with the EM algorithm; however, exact inference for the E-step is intractable. In this paper, we propose a variational approximation for the LDT that enables efficient learning of the model. We also propose a temporally-switching LDT (TS-LDT), which allows the layer shape to change over time, along with the associated EM algorithm and variational approximation. The ability of the LDT to segment video into layers of coherent appearance and dynamics is also extensively evaluated on both synthetic and natural video. These experiments show that the model can group regions of globally homogeneous, but locally heterogeneous, stochastic dynamics with an accuracy currently unparalleled in the literature.
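As background, each LDT layer is a temporal texture generated by a linear dynamical system (LDS): a hidden state evolves linearly with Gaussian noise, and each pixel in the layer is a noisy linear function of that state. The sketch below samples such a texture for a single layer; all dimensions, parameter values, and the scalar state are illustrative simplifications, not the paper's model.

```python
import random

# One LDT layer as a linear dynamical system (illustrative sketch):
#   state:       x_{t+1} = a * x_t + v_t,      v_t     ~ N(0, q^2)
#   observation: y_t[i]  = c[i] * x_t + w_t[i], w_t[i] ~ N(0, r^2)
# All sizes and parameter values below are hypothetical.

random.seed(0)

n_pixels, n_frames = 16, 50
a, q, r = 0.9, 0.1, 0.05                    # transition and noise scales
c = [random.gauss(0, 1) for _ in range(n_pixels)]  # per-pixel weights

x = random.gauss(0, 1)                      # initial hidden state
video = []
for _ in range(n_frames):
    x = a * x + random.gauss(0, q)          # linear state update
    frame = [c[i] * x + random.gauss(0, r) for i in range(n_pixels)]
    video.append(frame)

print(len(video), len(video[0]))            # → 50 16
```

A full LDT would run one such system per layer and let a Markov random field decide which pixels belong to which layer.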