Abstract
In this paper, we propose a method for predicting globally personalized head-related transfer functions (HRTFs) from anthropometric measurements and ear images. The model consists of two sub-networks. The first is the VGG-Ear Model, which extracts features from the ear images. The second sub-network takes the anthropometric measurements, the extracted ear features, and frequency information as input and predicts spherical harmonic (SH) coefficients. Finally, the personalized HRTF is reconstructed via the inverse spherical harmonic transform (SHT). A single trained model yields HRTFs for all directions, which greatly reduces the number of parameters and the training cost. To evaluate the proposed method objectively, we compute the spectral distance (SD) between the predicted and the measured HRTFs. The proposed method achieves an SD of 5.31 dB, outperforming the 7.61 dB of the average-HRTF baseline. Notably, the SD increases by only 0.09 dB compared with using pinna measurements directly.
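To make the reconstruction and evaluation steps concrete, the sketch below shows a generic inverse SHT (summing SH basis functions weighted by predicted coefficients) and a common log-spectral-distance definition. This is an illustration under assumed conventions, not the paper's implementation: the coefficient ordering (here degree-major, with order m running from -n to n), the SH normalization (SciPy's), and the SD formula may differ from those used by the authors.

```python
import numpy as np
from scipy.special import sph_harm  # complex spherical harmonics Y_n^m


def inverse_sht(coeffs, azimuth, colatitude, max_order):
    """Reconstruct values at given directions from SH coefficients.

    coeffs: complex array of length (max_order + 1)**2, assumed ordered
            degree-major: (n=0,m=0), (n=1,m=-1), (n=1,m=0), (n=1,m=1), ...
    azimuth, colatitude: arrays of direction angles in radians.
    """
    values = np.zeros(np.shape(azimuth), dtype=complex)
    idx = 0
    for n in range(max_order + 1):
        for m in range(-n, n + 1):
            # sph_harm(m, n, theta, phi): theta = azimuth, phi = colatitude
            values += coeffs[idx] * sph_harm(m, n, azimuth, colatitude)
            idx += 1
    return values


def spectral_distance_db(h_true, h_pred):
    """RMS log-spectral distance in dB between two magnitude responses
    (one common definition of the SD metric; the paper's may differ)."""
    ratio_db = 20.0 * np.log10(np.abs(h_true) / np.abs(h_pred))
    return np.sqrt(np.mean(ratio_db ** 2))
```

For example, a coefficient vector that is nonzero only in the (0, 0) term reconstructs to the constant Y_0^0 = 1/(2*sqrt(pi)) in every direction, which is a quick sanity check on the ordering.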