Abstract

Video content analysis is an interesting, meaningful, and challenging topic: it seeks to find meaningful structure and patterns in visual data for the efficient indexing and mining of videos. In this thesis, a new theoretical framework for video content analysis based on the video time density function (VTDF) and statistical models is proposed. The framework tackles video content analysis using semantic information from three perspectives: video summarization, video similarity measurement, and video event detection. The main research problems are first formulated mathematically. Two video data modeling tools are then presented to explore the spatiotemporal characteristics of video data: independent component analysis (ICA)-based feature extraction and the VTDF. Video summarization is categorized into two types, static and dynamic, and two new methods are proposed to generate static video summaries. The first builds a hierarchical key frame tree to summarize video content at multiple levels. The second is a vector quantization-based method that uses Gaussian mixtures (GM) and ICA mixtures (ICAM) to exploit the spatial characteristics of video data and generate a compact summary. The VTDF is then applied to develop several approaches for content-based video analysis: VTDF-based temporal quantization and statistical models summarize video content dynamically, a VTDF-based similarity model measures the similarity between two video sequences, and a VTDF-based event detection method classifies a video into pre-defined events. Video players with content-based fast-forward playback support are designed, developed, and implemented to demonstrate the feasibility of the proposed methods.
Given the rich literature on effective and efficient information coding and representation using probability density functions (PDFs), the VTDF is expected to serve as a foundation for video content representation, and more video content analysis methods are expected to be developed within the VTDF framework.
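The vector quantization idea behind the static summarization method can be illustrated with a minimal sketch: each frame is represented by a feature vector, a small codebook is learned by clustering, and the frame nearest each codeword is kept as a key frame. Note this is a simplified k-means stand-in for the GM/ICAM mixture models used in the thesis, and the synthetic frame features, cluster count, and evenly spaced initialization are all illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def vq_key_frames(features, n_codewords=3, iters=20):
    """Cluster per-frame feature vectors with simple k-means vector
    quantization; the frame nearest each codeword becomes a key frame."""
    # Initialize codewords from evenly spaced frames (a temporal-spread
    # heuristic); .copy() so codebook updates do not touch the features.
    step = max(1, len(features) // n_codewords)
    codebook = features[::step][:n_codewords].copy()
    for _ in range(iters):
        # Assign each frame to its nearest codeword.
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each codeword to the mean of its assigned frames.
        for k in range(n_codewords):
            if np.any(labels == k):
                codebook[k] = features[labels == k].mean(axis=0)
    # The static summary: one representative frame index per codeword.
    keys = [int(np.linalg.norm(features - c, axis=1).argmin()) for c in codebook]
    return sorted(set(keys))

# Synthetic stand-in for per-frame features (e.g., color histograms):
# three well-separated groups of 40 frames each.
rng = np.random.default_rng(1)
frames = np.vstack([rng.normal(c, 0.05, size=(40, 8)) for c in (0.0, 1.0, 2.0)])
print(vq_key_frames(frames, n_codewords=3))
```

With well-separated frame groups this selects one representative frame per group; the GM/ICAM variants in the thesis replace the hard k-means assignment with probabilistic mixture-component memberships.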

Highlights

  • With rapid technology advances in digital TV, multimedia, and the Internet, the amount of digital image, audio, and video data has increased dramatically in a very short period

  • Video 3 is used to show that GM vector quantization (GMVQ) can remove the redundancy of the video summary obtained by the SVS method

  • Video 4 is a long movie used to demonstrate GMVQ's capability to summarize a long-duration video into a compact summary


Summary

Introduction

With rapid technology advances in digital TV, multimedia, and the Internet, the amount of digital image, audio, and video data has increased dramatically in a very short period. Thanks to the increasing availability of computing resources and the popularity of Web 2.0 technologies, we have witnessed a growing number of user-centric applications that allow ordinary people to record, edit, deliver, and publish their own home-made digital videos on social webs or networks (e.g., YouTube). Interaction with videos has become an important part of our lives, and many related applications have emerged, including video on demand, digital video libraries, distance learning, and surveillance systems [55]. Microsoft launched Kinect for its Xbox 360 game console last year [4]. It uses video cameras for motion detection, skeletal tracking, and facial recognition. After capturing the input video and performing complex computation, it responds effectively to users' body actions.

