Abstract
Individual head-related transfer functions (HRTFs) are critical for binaural spatial audio rendering. In contrast to anthropometric parameters and pinnae images, 3D meshes allow for a more direct and comprehensive representation of the anthropometric structure, which provides highly effective inputs for modeling individualized HRTFs. This paper presents a neural network-based method for predicting individualized HRTFs in full space based on 3D meshes. Unlike many previous methods that estimate HRTF spectra at sampling grids or frequencies separately, the proposed model predicts the HRTF spectra of each vertical plane by considering the spectral correlation and continuity across adjacent sampling grids and frequencies. Evaluation results indicate that the proposed method enhances the prominence of peaks and notches in the obtained HRTF spectra and improves the speed and accuracy of HRTF individualization. The log spectral distortion of the proposed method is lower than that of state-of-the-art methods using anthropometric parameters and pinnae images. Further evaluation confirms that the proposed method requires significantly fewer points in 3D meshes when compared to numerical simulation methods. The evaluation based on localization models demonstrates that the HRTFs predicted by the proposed method are perceptually similar to the measured HRTFs.
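The log spectral distortion (LSD) cited above is commonly computed as the root-mean-square difference, in dB, between the measured and predicted magnitude spectra over frequency bins. The following is a minimal sketch of that common single-direction definition, not necessarily the exact formulation used in this paper. The function name, the `eps` guard, and the plain-list inputs are illustrative assumptions.

```python
import math


def log_spectral_distortion(measured, predicted, eps=1e-12):
    """Return the LSD in dB between two magnitude spectra of equal length.

    Assumed definition (common in the HRTF literature):
        LSD = sqrt( mean_k ( 20 * log10(|H(k)| / |H_hat(k)|) )^2 )
    `eps` is a hypothetical guard against log(0), not part of the definition.
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("spectra must be non-empty and of equal length")

    total = 0.0
    for h, h_hat in zip(measured, predicted):
        # Per-bin magnitude ratio expressed in dB.
        ratio_db = 20.0 * math.log10((abs(h) + eps) / (abs(h_hat) + eps))
        total += ratio_db ** 2

    # Root mean square over frequency bins.
    return math.sqrt(total / len(measured))
```

For a full-space evaluation, this per-direction value would typically be averaged over all sampled directions and both ears; that averaging step is likewise an assumption about the evaluation protocol.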