Abstract
This paper presents the Athens Information Technology system for 3D person tracking and its results in the CLEAR 2007 evaluations. The system uses audiovisual information from multiple acoustic and video sensors. It comprises a video subsystem and an audio subsystem, whose outputs are combined to track the last active speaker. The video subsystem combines the outputs of several 2D face localization systems in 3D, aiming to track all people present in a room. The audio subsystem applies an information-theoretic metric to an ensemble of microphones to estimate the active speaker.
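The abstract does not say how the 2D face localizations are combined in 3D. As an illustration only, the following minimal sketch assumes that each calibrated camera turns a detected face into a viewing ray (camera center plus direction) and that the 3D position is taken as the point closest to all rays in the least-squares sense. The function name `intersect_rays` and its helpers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _normalize(v: Sequence[float]) -> Vec3:
    """Return v scaled to unit length."""
    n = (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) ** 0.5
    if n == 0.0:
        raise ValueError("ray direction must be non-zero")
    return (v[0] / n, v[1] / n, v[2] / n)


def _solve3(a: List[List[float]], b: List[float]) -> Vec3:
    """Solve the 3x3 system a x = b by Gauss-Jordan elimination with partial pivoting."""
    m = [list(a[i]) + [b[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("rays are (nearly) parallel; position is not determined")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                for k in range(col, 4):
                    m[r][k] -= f * m[col][k]
    return (m[0][3] / m[0][0], m[1][3] / m[1][1], m[2][3] / m[2][2])


def intersect_rays(centers: Sequence[Sequence[float]],
                   directions: Sequence[Sequence[float]]) -> Vec3:
    """Least-squares point closest to a set of 3D rays.

    Each ray i has origin c_i (assumed camera center) and direction d_i
    (assumed back-projection of a 2D face detection). The point x minimizing
    the summed squared distance to all rays satisfies
        sum_i (I - d_i d_i^T) x = sum_i (I - d_i d_i^T) c_i.
    """
    if len(centers) != len(directions) or len(centers) < 2:
        raise ValueError("need at least two rays, one direction per center")
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centers, directions):
        d = _normalize(d)
        for i in range(3):
            for j in range(3):
                # Entry (i, j) of the projector onto the plane orthogonal to d.
                p = (1.0 if i == j else 0.0) - d[i] * d[j]
                a[i][j] += p
                b[i] += p * c[j]
    return _solve3(a, b)


# Usage: two hypothetical cameras whose rays both pass through one head position.
head = intersect_rays(
    centers=[(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    directions=[(2.0, 3.0, 1.7), (-2.0, 3.0, 1.7)],
)
```

With noisy detections the rays no longer meet exactly, and the same solve returns the point that minimizes the summed squared distance to them. A complete tracker would also need to associate detections across cameras and over time, which this sketch leaves out.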