Abstract

Discriminating between various types of speech and non-speech signals in an audio data stream is a fundamental step for further indexing and retrieval. This paper considers some of the basic problems in audio content classification, which is a key component of an automatic audio retrieval system. It illustrates a potential use of a statistical learning algorithm, the support vector machine (SVM), for the broadcast news (BN) audio classification task. The overall classification architecture uses a binary tree SVM (BT-SVM) decision scheme in combination with well-known audio features such as MFCCs and low-level MPEG-7 audio descriptors. An important step in creating such a classification system is to define the optimal features for each binary SVM classifier. Various feature selection algorithms can help to create such a feature set. We therefore implemented the F-score and Minimum Redundancy Maximum Relevance (MRMR) feature selection algorithms, which are effective search algorithms used in many pattern recognition tasks.
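As an illustration of F-score feature ranking for a single binary classifier, the following is a minimal sketch. It assumes the commonly used two-class F-score definition (squared distance of each class mean from the overall mean, divided by the sum of the within-class sample variances); the exact formulation used in the paper is not given in this excerpt. The function names `f_score` and `rank_features` and the 0/1 label encoding are illustrative choices.

```python
from statistics import mean


def f_score(pos, neg):
    """F-score of one feature, given its values in the positive and negative class.

    Assumed definition: between-class separation of the class means divided by
    the sum of the within-class sample variances. Each class needs >= 2 values.
    """
    all_mean = mean(pos + neg)
    pos_mean, neg_mean = mean(pos), mean(neg)
    between = (pos_mean - all_mean) ** 2 + (neg_mean - all_mean) ** 2
    within = (
        sum((x - pos_mean) ** 2 for x in pos) / (len(pos) - 1)
        + sum((x - neg_mean) ** 2 for x in neg) / (len(neg) - 1)
    )
    if within == 0:
        # Degenerate case: no spread inside either class.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(samples, labels):
    """Return (feature indices sorted by decreasing F-score, per-feature scores).

    `samples` is a list of equal-length feature vectors; `labels` holds 1 for
    the positive class and 0 for the negative class (an assumed encoding).
    """
    n_features = len(samples[0])
    scores = []
    for i in range(n_features):
        pos = [s[i] for s, y in zip(samples, labels) if y == 1]
        neg = [s[i] for s, y in zip(samples, labels) if y == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    return order, scores
```

In a setup like the one described, such a ranking would be computed separately for each binary SVM, and only the top-ranked features would be passed to that classifier. Note that the F-score evaluates each feature independently, whereas MRMR also accounts for redundancy between features.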

Highlights

  • The growing number of audio databases holding vast amounts of audio data demands efficient organization and manipulation of this data

  • This paper presents a possible solution for audio stream classification that uses a binary tree discrimination technique based on the support vector machine (SVM) classifier and two effective feature selection algorithms, applied to the processing and retrieval of broadcast news (BN) audio data

  • All evaluations within the training phase were based on the assumption that the SVM needs only a small set of data in order to preserve its generalization ability and to avoid overfitting
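The binary tree decision scheme mentioned above can be sketched as a chain of binary decisions, each using its own selected feature subset. This is a hypothetical illustration only: the class names, the order of the decisions, the feature indices, and the threshold functions standing in for trained SVMs are all assumptions, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

Features = Sequence[float]


@dataclass
class Node:
    """One binary decision in the tree (stands in for one trained binary SVM)."""
    decide: Callable[[Features], bool]  # True: stop here and emit `label`
    feature_idx: Sequence[int]          # feature subset selected for this node
    label: str                          # class emitted when `decide` is True
    rest: Union["Node", str]            # next node, or the final leaf label


def classify(node, x):
    """Walk the tree until a node accepts the sample or a leaf label is reached."""
    while isinstance(node, Node):
        selected = [x[i] for i in node.feature_idx]
        if node.decide(selected):
            return node.label
        node = node.rest
    return node


# Hypothetical three-class tree; the lambdas are placeholders for SVM decisions.
tree = Node(
    decide=lambda f: f[0] < 0.1, feature_idx=[0], label="silence",
    rest=Node(
        decide=lambda f: f[0] > 0.5, feature_idx=[1], label="speech",
        rest="music",
    ),
)
```

With this layout, a problem with N classes needs N - 1 binary classifiers, and each node can be given the feature subset that a selection algorithm ranked best for its particular two-way split.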

Introduction

The growing number of audio databases holding vast amounts of audio data demands efficient organization and manipulation of this data. Such processing is desirable for applications that require accurate discrimination of speech and non-speech segments, for instance automatic transcription of broadcast news (BN), speech and speaker recognition, and retrieval of audio queries. A fundamental step in audio stream processing is to automatically classify audio content into appropriate audio classes. We refer to this separation criterion as audio content classification. Classification is often carried out together with audio stream segmentation. The overall classification performance is conditioned by the process of feature extraction.
