Abstract

[This paper has been withdrawn by the publisher]. A novel method for identifying violent videos using only audio features is introduced. Most previous content-based image or video classification schemes apply the bag of words (BOW) or bag of visual words (BOVW) model, which employs multiple visual features to characterize image or video content. In our method, a bag of audio words (BOAW) is instead built from effective audio features, for two reasons. First, audio features are expected to be especially significant for violent videos. Second, the computational complexity of processing audio features is much lower than that of processing visual features. MPEG-7 low-level features, such as Audio Spectrum Centroid and Audio Spectrum Spread, and high-level features, such as Audio Signature, are combined into one 44-dimensional vector in the BOAW model. The audio words are built from these vectors by clustering, and a support vector machine (SVM) with a revised soft-weighting scheme is used to classify the audio-word features into two classes, violent and non-violent. Experiments demonstrate that the proposed method achieves good recall and precision in detecting violent videos. The method can also be applied to classify other types of videos.
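
The BOAW construction described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The function names (`kmeans`, `soft_histogram`), the codebook size, the `top_n` parameter, and the rank-based weight 1 / 2**rank are all assumptions made for illustration; the abstract does not specify the clustering algorithm or the details of the revised soft-weighting scheme. MPEG-7 feature extraction and the SVM stage are omitted, so the 44-dimensional frames in the usage example are synthetic random numbers.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(frames, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; the k centroids play the role of audio words.

    Assumption: the abstract only says "clustering", so k-means is a stand-in.
    """
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for frame in frames:
            nearest = min(range(k),
                          key=lambda j: squared_distance(frame, centroids[j]))
            clusters[nearest].append(frame)
        for j, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids


def soft_histogram(frames, codebook, top_n=2):
    """Soft-weighted bag of audio words for one clip.

    Each frame votes for its top_n nearest words with weight 1 / 2**rank
    (a simplified, hypothetical weighting, not the paper's revised scheme).
    The histogram is normalized to sum to 1.
    """
    hist = [0.0] * len(codebook)
    for frame in frames:
        ranked = sorted(range(len(codebook)),
                        key=lambda j: squared_distance(frame, codebook[j]))
        for rank, j in enumerate(ranked[:top_n]):
            hist[j] += 1.0 / (2 ** rank)
    total = sum(hist)
    return [h / total for h in hist] if total else hist


if __name__ == "__main__":
    # Synthetic stand-in for 44-dimensional per-frame audio feature vectors.
    rng = random.Random(42)
    training_frames = [[rng.random() for _ in range(44)] for _ in range(200)]
    codebook = kmeans(training_frames, k=8)

    clip_frames = [[rng.random() for _ in range(44)] for _ in range(30)]
    clip_vector = soft_histogram(clip_frames, codebook)
    print(len(clip_vector), round(sum(clip_vector), 6))
```

In a full pipeline, one such histogram per video clip would be the input vector to a two-class SVM (violent versus non-violent). Spreading each frame's vote over several nearby words, rather than only the single nearest one, is what makes the weighting "soft".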
