Audio Fingerprinting Techniques Research Articles

Audio fingerprinting techniques were developed to index and retrieve audio samples by comparing a compact, content-based signature of the audio instead of the entire audio sample, thereby reducing memory and computational expense. Different techniques have been applied to create audio fingerprints; however, with the introduction of deep learning, new data-driven unsupervised approaches are available. This article presents the Sequence-to-Sequence Autoencoder Model for Audio Fingerprinting (SAMAF), which improved hash generation through a novel loss function composed of three terms: Mean Square Error, minimizing the reconstruction error; Hash Loss, minimizing the distance between similar hashes and encouraging clustering; and Bitwise Entropy Loss, minimizing the variation inside the clusters. The performance of the model was assessed with a subset of the VoxCeleb1 dataset, a "speech in-the-wild" dataset. Furthermore, the model was compared against three baselines: Dejavu, a Shazam-like algorithm; Robust Audio Fingerprinting System (RAFS), a Bit Error Rate (BER) methodology robust to time-frequency distortions and coding/decoding transformations; and Panako, a constellation-based algorithm adding time-frequency distortion resilience. Extensive empirical evidence showed that the approach outperformed all the baselines in the audio identification task and in other classification tasks related to the attributes of the audio signal, with an economical hash size of either 128 or 256 bits for one second of audio.
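As a rough illustration of how fixed-size binary hashes such as the 128-bit fingerprints mentioned above can be compared, the sketch below computes a Bit Error Rate between two hashes and picks the closest entry in a small in-memory table. It is not the SAMAF or RAFS implementation; the function names, the integer encoding of a hash, and the 0.35 acceptance threshold are all assumptions chosen for illustration.

```python
from typing import Dict, Optional, Tuple


def bit_error_rate(hash_a: int, hash_b: int, n_bits: int = 128) -> float:
    """Fraction of bit positions in which two n_bits-wide hashes differ."""
    mask = (1 << n_bits) - 1
    diff = (hash_a ^ hash_b) & mask          # 1 wherever the hashes disagree
    return bin(diff).count("1") / n_bits     # Hamming distance / hash width


def best_match(
    query: int,
    database: Dict[str, int],
    n_bits: int = 128,
    threshold: float = 0.35,                 # illustrative value, an assumption
) -> Optional[Tuple[str, float]]:
    """Return (label, BER) of the closest stored hash, or None if no entry
    is within the threshold (or the database is empty)."""
    if not database:
        return None
    label, ber = min(
        ((name, bit_error_rate(query, h, n_bits)) for name, h in database.items()),
        key=lambda pair: pair[1],
    )
    return (label, ber) if ber <= threshold else None
```

A linear scan like this is only practical for small tables; larger collections would typically need an index over the hashes.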

An audio fingerprinting technique must identify an audio source successfully despite distortion caused by noise and echo. This work applies audio fingerprinting to TV commercial advertisement identification and proposes a robust fingerprinting method for that task. In the proposed method, a prominent audio peak pair fingerprint extracted from the constant Q transform improves the accuracy of the audio fingerprinting system in a variety of real noisy environments while keeping complexity low. Experimental results show that the proposed method is more robust than previous audio fingerprinting methods across different noise conditions and provides stable, reliable retrieval accuracy.
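The abstract does not describe how the peak pairs are formed, so the following is only a generic landmark-pairing sketch in the style of constellation-based fingerprinting, not the paper's method. It assumes the prominent peaks have already been picked from a time-frequency representation (for example a constant Q transform) and are given as `(time_frame, frequency_bin)` tuples; `peak_pair_hashes`, `fan_out`, and `max_dt` are hypothetical names and parameters.

```python
from typing import Iterable, List, Tuple

Peak = Tuple[int, int]                 # (time_frame, frequency_bin)
PairKey = Tuple[int, int, int]         # (anchor_bin, target_bin, frame_delta)


def peak_pair_hashes(
    peaks: Iterable[Peak],
    fan_out: int = 3,                  # max pairs per anchor peak (assumed)
    max_dt: int = 64,                  # max frame gap inside a pair (assumed)
) -> List[Tuple[PairKey, int]]:
    """Pair each anchor peak with up to `fan_out` later peaks that lie within
    `max_dt` frames. Each result is (key, anchor_time); the key depends only
    on the two frequency bins and the time difference, so it is unchanged
    when the whole clip is shifted in time."""
    ordered = sorted(peaks)            # sort by time, then frequency bin
    pairs: List[Tuple[PairKey, int]] = []
    for i, (t1, f1) in enumerate(ordered):
        paired = 0
        for t2, f2 in ordered[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break                  # later peaks are even further away
            if dt == 0:
                continue               # skip peaks in the same frame
            pairs.append(((f1, f2, dt), t1))
            paired += 1
            if paired == fan_out:
                break
    return pairs
```

Matching would then look up each key in an index built from reference clips and vote on consistent anchor-time offsets; that step is omitted here.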

Related Topics

Articles published on Audio Fingerprinting Techniques

Auditory Image Perception and Verbal Brain Subsystems

SAMAF

Key‐dependent audio fingerprinting technique based on a quantisation minimum‐distance hash extractor in the DWT domain

Recognition of Activities of Daily Living Based on Environmental Analyses Using Audio Fingerprinting Techniques: A Systematic Review.

A low-complexity audio fingerprinting technique for embedded applications

Synchronizing multimodal recordings using audio-to-audio alignment

Audio Fingerprinting Based on Constant Q Transform for TV Commercial Advertisement Identification

Automatic highlights extraction for drama video using music emotion and human face features

An effective and efficient indexing scheme for audio fingerprinting

A Public Audio Identification Evaluation Framework for Broadcast Monitoring

Audio Fingerprinting Based on Multiple Hashing in DCT Domain
