Abstract

The medium-term goal of this research project is to recover several sound sources from a live binaural recording, after first measuring the acoustic response of the room. As a preliminary step, this paper focuses on the reconstruction of n sources from m convolutive mixtures when m < n (the underdetermined case), assuming the mixing matrix is known. The reconstruction is done in the frequency domain by assuming that the real and imaginary parts of the source components are Laplacian. Maximizing the posterior likelihood then leads to minimization of the 1-norm subject to the mixing equations, which is an instance of linear programming (LP). Alternatively, imposing the Laplacian assumption on the magnitudes leads to second-order cone programming (SOCP). Performance experiments are run on synthetic mixtures based on realistic simulations of each source-microphone impulse response. Two sets of sources are used as benchmarks: four speech utterances and six short violin melodies. Results show signal-to-noise ratios of the reconstruction of around 10 dB. SOCP performs slightly better, if at all, and is probably too slow for real-time processing. In the last part of this paper we train a neural network to predict the response of the optimizer. Preliminary results show that the approach is feasible but not yet mature.
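
The two optimization problems mentioned above can be written out for a single frequency bin as follows. This is only a sketch under stated assumptions: the symbols x (observed mixtures), A (known mixing matrix), s (unknown sources), u, v and t (auxiliary variables) are notation introduced here for illustration, the Laplacian prior is taken with unit scale for every source, and the mixing equations are treated as exact (noise-free) constraints. The paper's own notation and scaling may differ.

```latex
% Assumed notation (not from the paper), per frequency bin:
%   x \in \mathbb{C}^m : observed mixtures,  A \in \mathbb{C}^{m \times n} : known mixing matrix,
%   s \in \mathbb{C}^n : unknown sources,    m < n.

% (1) Laplacian prior on real and imaginary parts -> 1-norm minimization
\begin{align}
  p(s) &\propto \prod_{i=1}^{n}
          \exp\bigl(-|\operatorname{Re} s_i| - |\operatorname{Im} s_i|\bigr), \\
  \hat{s} &= \arg\min_{s} \sum_{i=1}^{n}
          \bigl(|\operatorname{Re} s_i| + |\operatorname{Im} s_i|\bigr)
          \quad \text{s.t.} \quad A s = x .
\end{align}

% Equivalent LP: stack real and imaginary parts, then split into
% nonnegative parts u, v \in \mathbb{R}^{2n}.
\begin{align}
  \tilde{A} &=
    \begin{pmatrix}
      \operatorname{Re} A & -\operatorname{Im} A \\
      \operatorname{Im} A & \phantom{-}\operatorname{Re} A
    \end{pmatrix},
  \qquad
  \tilde{x} =
    \begin{pmatrix} \operatorname{Re} x \\ \operatorname{Im} x \end{pmatrix}, \\
  \min_{u,\,v} \;& \mathbf{1}^{\top}(u + v)
  \quad \text{s.t.} \quad \tilde{A}\,(u - v) = \tilde{x},
  \quad u \ge 0,\; v \ge 0 .
\end{align}

% (2) Laplacian prior on magnitudes -> SOCP
\begin{align}
  p(s) &\propto \prod_{i=1}^{n} \exp\bigl(-|s_i|\bigr), \\
  \min_{s,\,t} \;& \sum_{i=1}^{n} t_i
  \quad \text{s.t.} \quad A s = x,
  \quad \bigl\| (\operatorname{Re} s_i,\ \operatorname{Im} s_i) \bigr\|_2 \le t_i,
  \quad i = 1, \dots, n .
\end{align}
```

In the LP, the stacked solution is recovered as u - v, whose first n entries give Re s and last n entries give Im s. In the SOCP, each constraint is a three-dimensional second-order cone, so the magnitude of each source is penalized as a whole rather than its real and imaginary parts separately.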
