A Deep Learning approach for the Generation of Room Impulse Responses

Daniel A Sanaguano-Moreno, José F Lucio-Naranjo, Luis Bravo-Moncayo, Roberto A Tenenbaum, Gabriel B Regattiere-Sampaio

doi:10.1109/ici2st57350.2022.00017

Abstract

Generating high-resolution binaural room impulse responses (BRIRs) for a given position requires significant computational resources, which makes it impractical for real-time acoustic virtual reality. The main problem is the large number of coefficients required to simulate high-quality BRIRs, even in low-reverberation environments. Reconstructing a BRIR with fewer computational resources therefore remains a challenge. This work introduces an approach for generating BRIRs from a compressed BRIR representation using a variational autoencoder (VAE). The approach consists of 1) simulating a certain number of BRIRs for a given virtual environment using acoustic simulation software; 2) generating a dataset with enough BRIRs distributed uniformly over the given scenario through a data augmentation process; 3) applying a clustering technique to homogenize the BRIR dataset before training; 4) training a VAE to obtain dimensionally reduced BRIRs from the encoder; and 5) generating a BRIR from the compressed representation using the decoder of the VAE. Only a segment of each BRIR was used to train and evaluate this approach. The approach was evaluated by computing the mean squared error (MSE) between the simulated BRIRs and the BRIRs generated by the model, which yielded a low MSE.
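The evaluation step described in the abstract, comparing a segment of a simulated BRIR against the corresponding segment of a generated BRIR by MSE, can be illustrated with a minimal sketch. The abstract does not give the segment boundaries, the sample rate, or how the two ear channels are combined, so the function name `segment_mse`, its `start` and `length` parameters, and the choice to average over both channels are assumptions made here for illustration only.

```python
def segment_mse(simulated, generated, start=0, length=None):
    """Mean squared error between two binaural responses over one segment.

    Each response is a (left, right) pair of equal-length sample sequences.
    Averaging over both ears is an assumption, not the paper's stated metric.
    """
    if len(simulated) != len(generated):
        raise ValueError("responses must have the same number of channels")

    total = 0.0
    count = 0
    for sim_ch, gen_ch in zip(simulated, generated):
        if len(sim_ch) != len(gen_ch):
            raise ValueError("channels must have the same number of samples")
        # Hypothetical segment selection: the abstract only says that
        # "a segment" of the BRIR was used, without giving its bounds.
        stop = len(sim_ch) if length is None else start + length
        for s, g in zip(sim_ch[start:stop], gen_ch[start:stop]):
            total += (s - g) ** 2
            count += 1

    if count == 0:
        raise ValueError("selected segment is empty")
    return total / count


# Toy data, not taken from the paper: four samples per ear.
simulated_brir = ([1.0, 0.5, 0.25, 0.0], [0.8, 0.4, 0.2, 0.0])
generated_brir = ([1.0, 0.0, 0.25, 0.0], [0.8, 0.4, 0.0, 0.0])

error = segment_mse(simulated_brir, generated_brir, start=0, length=3)
print(f"MSE over the first 3 samples: {error:.6f}")
```

In practice the generated response would come from the VAE decoder rather than from hand-written values, and the same segment would be used for both training and evaluation, as the abstract states.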

Full Text