Speech/music discrimination using hybrid-based feature extraction for audio data indexing

Kun-Ching Wang, Ying-Ru Yang, Yung-Ming Yang

doi:10.1109/icsse.2017.8030927

Abstract

In this paper, we present a speech/music discrimination (SMD) approach that uses hybrid feature extraction to classify noisy audio signals as speech or music. The hybrid-based SMD combines 1D signal processing and 2D image processing to extract multiple features. In general, a noisy audio segment can be regarded as music, speech, or noise (silence). The proposed hybrid-based SMD approach has been applied to audio data indexing to classify noisy audio signals into speech, music, and noise. The approach comprises three main stages: pre-processing/voice activity detection (VAD), speech/music discrimination (SMD), and rule-based post-processing. Pre-processing and VAD together form the first stage, which separates the audio recording stream into noise-only segments and noisy audio segments. The hybrid-based SMD forms the second stage, which classifies the noisy audio segments into speech segments and music segments. In the third stage, a rule-based post-filtering method is applied to improve the discrimination accuracy and to reflect the continuity of audio data in time. Experimental results show that the proposed hybrid-based SMD approach can be successfully applied to audio data indexing. The overall system accuracy is evaluated on radio recordings from various sources. A performance comparison with existing methods on the envisaged tasks is also given.
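The three-stage structure described in the abstract can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: the energy-threshold VAD, the majority-vote post-filter, and all names (`detect_activity`, `classify_frames`, `smooth_labels`, `index_audio`, `energy_threshold`, `window`) are assumptions chosen for illustration, and the hybrid 1D/2D feature extraction is left to a caller-supplied `discriminate` function because the abstract does not specify it.

```python
from collections import Counter

NOISE, SPEECH, MUSIC = "noise", "speech", "music"


def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def detect_activity(frames, energy_threshold):
    """Stage 1 stand-in (assumed): mark frames whose energy exceeds a threshold."""
    return [frame_energy(f) > energy_threshold for f in frames]


def classify_frames(frames, active, discriminate):
    """Stage 2: label active frames with a caller-supplied speech/music discriminator."""
    return [discriminate(f) if a else NOISE for f, a in zip(frames, active)]


def smooth_labels(labels, window=5):
    """Stage 3 stand-in (assumed rule): majority vote over a sliding window,
    so isolated label flips are absorbed by their neighbours."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighbourhood).most_common(1)[0][0])
    return smoothed


def index_audio(frames, discriminate, energy_threshold=1e-3, window=5):
    """Run the three stages in order and return one label per frame."""
    active = detect_activity(frames, energy_threshold)
    labels = classify_frames(frames, active, discriminate)
    return smooth_labels(labels, window)
```

For example, `smooth_labels` with `window=3` turns a single stray `music` label inside a run of `speech` frames back into `speech`, which is one simple way to reflect temporal continuity; the actual rules used in the paper may differ.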

Similar Papers

Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD.
Kun-Ching Wang
Entropy | VOL. 22
06 Feb 2020

An evolutionary approach for segmentation of noisy speech signals for efficient voice activity detection
Farook Sattar ... Moe Pwint
Artificial Intelligence Research | VOL. 5
26 Oct 2015

Audio data indexing: Use of second-order statistics for speaker-based segmentation
P Delacourt ... C Wellekens
07 Jun 1999

Automatic scene change detection for composed speech and music sound under low SNR noisy environment
Chaug-Ching Huang ... Dian-Jia Wu
IEEE Transactions on Speech and Audio Processing | VOL. 13
01 Sep 2005
