Off-Screen Sound Separation Based on Audio-visual Pre-training Using Binaural Audio.

Masaki Yoshida, Takahiro Ogawa, Miki Haseyama, Ren Togo

doi:10.3390/s23094540

Abstract

This study proposes a novel off-screen sound separation method based on audio-visual pre-training. In audio-visual analysis, researchers have leveraged visual information for audio manipulation tasks such as sound source separation. Although such tasks rely on correspondences between audio and video, these correspondences are not always established. Specifically, sounds coming from outside the screen have no audio-visual correspondence and thus interfere with conventional audio-visual learning. The proposed method separates such off-screen sounds based on their arrival directions using binaural audio, which conveys a three-dimensional sense of where sounds originate. Furthermore, we propose a new pre-training method that takes the off-screen space into account, and we use the resulting representation to improve off-screen sound separation. Consequently, the proposed method can separate off-screen sounds irrespective of the direction from which they arrive. Because ground truth for off-screen sounds is difficult to collect, we conducted our evaluation using generated video data. We confirmed the effectiveness of our methods through off-screen sound detection and separation tasks.
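
As background on why binaural audio carries arrival-direction information, one classic cue is the interaural time difference (ITD): a sound from the left reaches the left ear slightly earlier than the right ear. The sketch below is not the method proposed in the paper, which relies on learned audio-visual representations. It is only a minimal illustration, assuming NumPy and a synthetic noise signal; the function name `estimate_itd` and all parameter values are hypothetical.

```python
import numpy as np


def estimate_itd(left, right, sample_rate):
    """Estimate the interaural time difference in seconds.

    A positive value means the sound reached the left ear first
    (the right channel is a delayed copy of the left channel).
    """
    # Cross-correlate the two channels over all possible lags.
    corr = np.correlate(right, left, mode="full")
    # In "full" mode, index len(left) - 1 corresponds to zero lag.
    lag_samples = int(np.argmax(corr)) - (len(left) - 1)
    return lag_samples / sample_rate


# Synthetic example: white noise that arrives 8 samples later at the right ear.
sample_rate = 16000
delay = 8
rng = np.random.default_rng(0)
left = rng.standard_normal(1024)
right = np.concatenate([np.zeros(delay), left[:-delay]])

itd = estimate_itd(left, right, sample_rate)
print(f"Estimated ITD: {itd * 1e3:.3f} ms")
```

A positive estimate suggests a source on the left side and a negative one a source on the right. Real binaural recordings also carry level and spectral differences between the ears, which a time-delay estimate alone does not capture.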

Journal: Sensors (Basel, Switzerland)
Publication Date: May 7, 2023
Citations: 1
License type: CC BY 4.0

Similar Papers

Multi-channel Environmental Sound Segmentation utilizing Sound Source Localization and Separation U-Net
Yui Sudo ... Kazuhiro Nakadai
11 Jan 2021

Multichannel environmental sound segmentation
Yui Sudo ... Kenji Nishida
Applied Intelligence | VOL. 51
30 Mar 2021

Application of sound source separation methods to advanced spatial audio systems.
Máximo Cobos Serrano
03 Dec 2010

Beyond Mono to Binaural: Generating Binaural Audio from Mono Audio with Depth and Cross Modal Attention
Kranti Kumar Parida ... Gaurav Sharma
01 Jan 2021
