Abstract
This work presents an adapted version of the Computational Geometry Algorithm (CGA) used for the development of audio-based applications and services. The CGA analyses an audio stream and produces a unique set of points that can be regarded as the "fingerprint" of the audio data. It is shown that this fingerprint is coding-independent, which makes the proposed algorithm suitable for multiple purposes, including the categorisation of content identity and the identification of audio clips, and hence supports the realisation of audio sorting and searching tasks and services. Additionally, the overall performance and efficiency characteristics of the CGA are discussed and analysed on the basis of specific novel applications and services.
Highlights
The use of digital technology for content distribution and reproduction introduces new domains and audio-embedded applications.
The fingerprint produced for each sampled audio clip is a significant parameter that may affect the realisation complexity, mainly in terms of computational load and memory requirements. According to our latest studies (Gillespie, 2004; Kosch, 2004; Trifonova et al., 2008), both requirements are met by reducing the spectral resolution of the original Fast Fourier Transform (FFT) magnitudes using the Onion Algorithm (OA).
Further Research and Conclusions: The research field addressed in this work is broad and interdisciplinary, incorporating aspects of computer science and, in our case, diverse and essentially interconnected fields such as archival science, cognitive science, commerce, communications, law, library science and signal processing.
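The reduction of FFT magnitudes mentioned in the highlights can be illustrated with a minimal sketch. It assumes that the Onion Algorithm refers to convex layer ("onion") peeling from computational geometry, which the text does not spell out. The function names, the `max_layers` parameter and the sample (frequency bin, magnitude) pairs are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop points that would create a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def onion_layers(points: List[Point], max_layers: int) -> List[List[Point]]:
    """Peel up to max_layers nested convex hulls, outermost first."""
    remaining = set(points)
    layers: List[List[Point]] = []
    while remaining and len(layers) < max_layers:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers


# Hypothetical (frequency bin, magnitude) pairs standing in for FFT magnitudes.
spectrum = [(0, 1.0), (1, 4.0), (2, 2.0), (3, 2.5), (4, 5.0), (5, 1.5), (6, 0.5)]
layers = onion_layers(spectrum, max_layers=2)

# Keeping only the outermost layer gives a reduced point set for the fingerprint.
fingerprint = layers[0]
```

In this sketch the number of retained layers controls the trade-off between fingerprint size and spectral detail: fewer layers mean fewer stored points and less memory, at the cost of discarding interior points.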
Summary
The use of digital technology for content distribution and reproduction introduces new domains and audio-embedded applications. Audio fingerprinting techniques aim to develop mechanisms for assessing the perceptual equivalence of different audio content. This is done by computing a summary of the corresponding audio clip, which is typically stored in a database and serves as an index to the audio library metadata (Beekhof, Voloshynovskiy, Koval, & Holotyak, 2009). Some existing techniques derive power spectral density (PSD) peaks and represent them using a scatter plot. This approach introduces a time-alignment problem between the peaks of the original audio and those of the comparison audio in the “noisy” (distorted) case (Baluja & Covell, 2006; Chandrasekhar, Sharifi, & Ross, 2011), because matching relies on detecting a significant cluster of points that forms a diagonal line within the scatter plot, with density as the criterion (Wang, 2003).
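The diagonal-line matching described above can be sketched as an offset histogram. Points lying on a diagonal of slope one in a scatter plot of (query time, reference time) share the same time offset, so the densest offset indicates the alignment. This is an illustrative simplification and not the implementation used in the cited works. The function name, the sample pairs and the threshold of 3 are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def best_offset(pairs: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (offset, count) for the most common reference-minus-query offset."""
    if not pairs:
        return (0, 0)
    offsets = Counter(t_ref - t_query for t_query, t_ref in pairs)
    return offsets.most_common(1)[0]


# Hypothetical matched peak times as (query frame, reference frame).
pairs = [(0, 40), (3, 43), (7, 47), (9, 12), (12, 52), (15, 90)]
offset, count = best_offset(pairs)

# Hypothetical density criterion: enough peaks must agree on one offset.
is_match = count >= 3
```

The sketch also shows where the time-alignment problem comes from: if distortion shifts the peak times unevenly, the offsets no longer coincide and the cluster spreads across several histogram bins.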