Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarity within Polyphonic Audio

Jonathan Doherty, Paul Mckevitt, Kevin Curran

doi:10.12928/telkomnika.v15i1.4581

Open Access

https://doi.org/10.12928/telkomnika.v15i1.4581

Abstract

One approach overlooked to date, which can work alongside existing audio compression schemes, exploits the semantics and natural repetition of music through meta-data tagging. Similarity detection within polyphonic audio has proved a difficult problem in the field of Music Information Retrieval. This paper presents a method (SoFI) that uses meta-data to improve the quality of stored audio broadcast over any wireless medium. The method has a number of commercially valuable applications. Our system works at the content level, which makes it applicable to existing streaming services. The MPEG-7 Audio Spectrum Envelope (ASE) provides the features to be extracted and, combined with k-means clustering, enables self-similarity detection within polyphonic audio. SoFI uses string matching to identify similarity between large sections of clustered audio. Objective evaluations show that SoFI detects high levels of similarity over varying lengths of time within an audio file. On a scale from 0 to 1, where 0 is best, a clear correlation of 0.2491 between sections identified as similar indicates successful identification.
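
The pipeline outlined in the abstract (per-frame spectral features, k-means clustering, then string matching over the cluster labels) can be illustrated with a minimal Python sketch. This is not the SoFI implementation: the two-dimensional vectors stand in for real ASE frames, and the function names, the farthest-point initialisation, and the longest-repeated-substring matcher are all assumptions chosen to keep the example short and deterministic.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(frames, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per frame.

    Assumption: centroids are seeded by farthest-point selection so the
    result is deterministic (not taken from the paper).
    """
    centroids = [list(frames[0])]
    while len(centroids) < k:
        far = max(frames, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid for every frame.
        labels = [min(range(k), key=lambda j: sq_dist(f, centroids[j]))
                  for f in frames]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def labels_to_string(labels):
    """Map cluster labels 0, 1, 2, ... to the characters 'a', 'b', 'c', ..."""
    return "".join(chr(ord("a") + lab) for lab in labels)


def longest_repeat(s):
    """Longest non-overlapping repeated substring of s.

    Returns (length, first_start, second_start), found by dynamic programming.
    """
    n = len(s)
    best = (0, 0, 0)
    prev = [0] * (n + 1)
    for i in range(1, n + 1):
        cur = [0] * (n + 1)
        for j in range(i + 1, n + 1):
            if s[i - 1] == s[j - 1]:
                # Cap at j - i so the two occurrences never overlap.
                cur[j] = min(prev[j - 1] + 1, j - i)
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    return best


# Hypothetical stand-in for ASE frames: three prototype spectra plus jitter,
# laid out as section A, section B, section A again.
rng = random.Random(0)
protos = {"x": (0.0, 0.0), "y": (5.0, 5.0), "z": (10.0, 0.0)}
layout = "xxyz" + "zzyy" + "xxyz"
frames = [[v + rng.gauss(0, 0.1) for v in protos[ch]] for ch in layout]

text = labels_to_string(kmeans(frames, k=3))
length, first, second = longest_repeat(text)
print(text, length, first, second)
```

In this toy input the repeated section covers the first and last four frames, so the matcher reports a repeat of length 4 starting at frames 0 and 8. Real ASE frames have many more bands per frame and far noisier cluster strings, so an approximate string matcher would likely be needed in practice.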