Abstract

Binaural rendering is a technique that seeks to generate virtual auditory environments that replicate the natural listening experience, including the three-dimensional perception of spatialized sound sources. Real-time knowledge of the listener's position, or more specifically of their head and ear orientation, allows movement in the real world to be transferred to virtual spaces, which in turn enables richer immersion in and interaction with the virtual scene. This study presents the use of a simple laptop-integrated camera (webcam) as a head-tracking sensor, eliminating the need to mount any hardware on the listener's head. The software is built on top of a state-of-the-art face landmark detection model from Google's MediaPipe library for Python. The coordinate system is manipulated to translate the origin from the camera to the center of the subject's head, so that rotation matrices and Euler angles can be extracted adequately. Low-latency communication is provided via the User Datagram Protocol (UDP), allowing the head tracker to run in parallel with, and asynchronously from, the main application. Empirical experiments demonstrated reasonable accuracy and quick response, indicating suitability for real-time applications that do not require strict precision.
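The last two processing steps mentioned above, extracting Euler angles from a rotation matrix and sending them over UDP, can be illustrated with a minimal Python sketch. It is not the study's implementation: the Z-Y-X (yaw, pitch, roll) rotation order, the 12-byte little-endian packet layout, and every function name are assumptions chosen for illustration, and the rotation matrix is assumed to be already expressed in a head-centered frame.

```python
import math
import socket
import struct


def euler_to_matrix(yaw, pitch, roll):
    """Build R = Rz(yaw) @ Ry(pitch) @ Rx(roll); angles in radians.

    The Z-Y-X order is an assumed convention, not one stated in the abstract.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def matrix_to_euler(R):
    """Recover (yaw, pitch, roll) from a 3x3 matrix built as above.

    Gimbal lock (pitch near +/-90 degrees) is not handled in this sketch.
    """
    # Clamp so rounding error cannot push the asin argument outside [-1, 1].
    pitch = math.asin(max(-1.0, min(1.0, -R[2][0])))
    roll = math.atan2(R[2][1], R[2][2])
    yaw = math.atan2(R[1][0], R[0][0])
    return yaw, pitch, roll


def pack_orientation(yaw, pitch, roll):
    """Serialize three angles as little-endian float32 (hypothetical format)."""
    return struct.pack("<3f", yaw, pitch, roll)


def send_orientation(sock, address, yaw, pitch, roll):
    """Send one orientation datagram; UDP does not wait for the receiver."""
    sock.sendto(pack_orientation(yaw, pitch, roll), address)
```

A tracker loop could create a socket once with `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and call `send_orientation` for each video frame. The address of the rendering application, for example `("127.0.0.1", 5005)`, is a placeholder rather than a value from the study.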
