Abstract

Audio signal categorization is one of the fundamental steps in applications such as content-based audio information retrieval, audio indexing, and speaker identification. In this work, a rigorous, non-stationary methodology capable of categorizing speech and various music signals is proposed. The multifractal detrended fluctuation analysis (MFDFA) method is used to analyse the internal dynamics of digitized audio signals. The test data include speech (non-musical), drone (periodically musical) and music samples of Rāgas (of varying musicality) from Indian classical music (INDIC). It is found that the degree of complexity and multifractality (measured by the width of the multifractal spectrum) changes from the start towards the end of each audio sample; however, the range of this variation is smallest for speech and drone. The normalized width of the multifractal spectrum is strikingly different for speech and for drone, and experimental results show that this parameter can effectively classify the two classes of signal. Further, we have experimented with a number of clips of INDIC Rāgas spanning a range of musicality and mood content; the results show that the width of the multifractal spectrum can also categorize different music signals. In contrast with conventional stationary techniques for audio signal analysis, we apply complexity analysis directly to the non-stationary audio signals without converting them to the frequency domain, working with the basic waveforms after de-noising.
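To make the measured quantity concrete, the following is a minimal sketch of how the width of the multifractal spectrum can be estimated via MFDFA. The segment sizes, the set of q values, and the linear detrending order are illustrative assumptions for this sketch, not the paper's exact settings.

```python
# Minimal MFDFA sketch: estimate the multifractal spectrum width of a
# 1-D signal. Scales, q values, and detrending order are assumptions.
import numpy as np

def mfdfa_width(x, scales=None, qs=None, order=1):
    """Estimate the multifractal spectrum width of signal x via MFDFA."""
    x = np.asarray(x, dtype=float)
    if scales is None:
        # Dyadic-ish segment sizes from 16 up to a quarter of the signal.
        scales = np.unique(
            np.logspace(4, np.log2(len(x) // 4), 20, base=2).astype(int))
    if qs is None:
        # Moment orders; q = 0 is skipped because F_q divides by q.
        qs = np.concatenate([np.arange(-5.0, 0.0), np.arange(1.0, 6.0)])
    # Step 1: the "profile" -- cumulative sum of the demeaned signal.
    profile = np.cumsum(x - x.mean())
    hq = []
    for q in qs:
        Fq = []
        for s in scales:
            n_seg = len(profile) // s
            segs = profile[:n_seg * s].reshape(n_seg, s)
            t = np.arange(s)
            # Step 2: detrend each segment with a polynomial fit (order=1
            # gives linear detrending, i.e. DFA1) and keep the variance
            # of the residuals.
            f2 = np.array([
                np.mean((seg - np.polyval(np.polyfit(t, seg, order), t)) ** 2)
                for seg in segs])
            # Step 3: q-th order fluctuation function.
            Fq.append(np.mean(f2 ** (q / 2.0)) ** (1.0 / q))
        # Step 4: generalized Hurst exponent h(q) from the log-log slope.
        hq.append(np.polyfit(np.log(scales), np.log(Fq), 1)[0])
    hq = np.array(hq)
    # Step 5: Legendre transform to the singularity spectrum; the width
    # is max(alpha) - min(alpha).
    tau = qs * hq - 1.0
    alpha = np.gradient(tau, qs)
    return alpha.max() - alpha.min()

# Example with synthetic data; real use would pass a de-noised waveform.
rng = np.random.default_rng(0)
noise = rng.standard_normal(2 ** 14)  # monofractal-like reference signal
print("spectrum width (white noise):", mfdfa_width(noise))
```

In this sketch the width is read off as max(α) − min(α) of the singularity spectrum obtained from τ(q) = q·h(q) − 1; a wider spectrum indicates stronger multifractality, which is the property used above to separate speech, drone, and Rāga clips. A nearly monofractal signal such as white noise should yield a small width, since h(q) is roughly constant in q.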
