Abstract

This paper presents the Athens Information Technology system for 3D person tracking and the obtained results in the CLEAR 2007 evaluations. The system utilizes audiovisual information from multiple acoustic and video sensors. The proposed system comprises a video and an audio subsystem whose results are suitably combined to track the last active speaker. The video subsystem combines in 3D a number of 2D face localization systems, aiming at tracking all people present in a room. The audio subsystem uses an information theoretic metric upon an ensemble of microphones to estimate the active speaker.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.