Content-based classification and retrieval of audio

Tong Zhang, C.-C. Jay Kuo

doi:10.1117/12.325703

Publication Date: Oct 2, 1998

Citations: 52

Affiliation: University of Southern California

Abstract

This research presents an on-line audio classification and segmentation system in which audio recordings are segmented and classified into speech, music, several types of environmental sounds, and silence, based on audio content analysis. It is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include temporal curves of the energy function, the average zero-crossing rate, and the fundamental frequency of audio signals, as well as statistical and morphological features of these curves. Classification is performed by a threshold-based heuristic procedure. We describe the audio database that we have built, the details of the feature extraction, classification, and segmentation procedures, and the experimental results. The results show that, with the proposed system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy of over 90 percent. We also outline further classification of audio into finer types and a query-by-example audio retrieval system built on top of the coarse classification.
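To make the feature vocabulary above concrete, the following is a minimal Python sketch of two of the named features, short-time energy and zero-crossing rate, followed by a toy threshold rule and a run-merging segmentation step. It is not the authors' procedure: the function names, the frame length, the two threshold values, and the labels "low-zcr" and "high-zcr" are all assumptions chosen for illustration, and the fundamental-frequency feature and the statistical and morphological curve features are omitted.

```python
def frames(signal, frame_len, hop):
    """Split a list of samples into frames of frame_len, advancing by hop."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def label_frame(frame, energy_floor=1e-4, zcr_split=0.1):
    """Toy threshold rule. Both thresholds are illustrative assumptions,
    not values taken from the paper."""
    if short_time_energy(frame) < energy_floor:
        return "silence"
    if zero_crossing_rate(frame) > zcr_split:
        return "high-zcr"
    return "low-zcr"


def segment(labels):
    """Merge runs of identical frame labels into (label, start, end) tuples,
    where start and end are frame indices and end is exclusive."""
    segments = []
    for i, lab in enumerate(labels):
        if segments and segments[-1][0] == lab:
            segments[-1] = (lab, segments[-1][1], i + 1)
        else:
            segments.append((lab, i, i + 1))
    return segments
```

A caller would compute `labels = [label_frame(f) for f in frames(samples, 200, 200)]` and pass the result to `segment`. A system of the kind the abstract describes would need rules over how these curves evolve in time, rather than a per-frame cut-off, to separate speech, music, and environmental sounds.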

Full Text