Abstract

The research field covered by this thesis concerns the recognition and modelling of human behaviour in human-computer interaction environments, using face analysis as the input modality. The research focuses on cases where no specific knowledge of the setup and no specialized equipment are available, apart from simple hardware such as a common web camera. Such systems are normally based on assumptions regarding user position, camera parameters, or specific hardware. Beyond the effort to avoid such assumptions, one of the basic principles of this thesis was to investigate a series of components that are not statically positioned in the architecture but dynamically emphasized throughout each process. The architecture of each component was developed independently, targeting non-intrusive environments that encourage spontaneous movement, as well as uncontrolled lighting conditions and backgrounds. In more detail, local techniques for facial feature tracking are employed, as well as holistic techniques using Convolutional Neural Networks, and prototype inference architectures are proposed. Furthermore, this thesis combines head rotation with eye gaze directionality estimation. For estimating eye gaze, the effect of head rotation is virtually cancelled by employing 3D geometrical models, and iris positions are compared to reference topologies. A further major challenge in this direction was the modelling of the extracted facial data in order to train systems that imitate human perception of engagement in human-computer interaction scenarios; to this end, fuzzy logic was used. This modelling was also used as prior knowledge to optimize hybrid head pose estimation methodologies by training Bayesian Modality Fusion Networks. The theoretical conclusions are grounded on demanding datasets whose degree of fuzziness often confused even human annotators. Nevertheless, the results highlight the prospect of employing non-intrusive mechanisms for inferring engagement from non-verbal communication, using face analysis, in a wide range of applications related to affective computing.
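To make the gaze-estimation idea concrete, the following is a minimal illustrative sketch, not the thesis implementation: it assumes head-pose angles (yaw, pitch, roll) are already estimated, applies the inverse head rotation to 3D eye landmarks to "frontalize" them, and compares the frontalized iris position to a reference (straight-ahead) topology. The function names, the Euler-angle composition order, and the numeric landmark values are hypothetical and serve only to show the geometry.

    import numpy as np

    def rotation_matrix(yaw, pitch, roll):
        # Build a 3D rotation matrix from head-pose angles (radians).
        # The Rz @ Ry @ Rx composition order is an assumption of this sketch.
        cy, sy = np.cos(yaw), np.sin(yaw)
        cp, sp = np.cos(pitch), np.sin(pitch)
        cr, sr = np.cos(roll), np.sin(roll)
        Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # yaw, about y
        Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])   # pitch, about x
        Rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])   # roll, about z
        return Rz @ Ry @ Rx

    def frontalize(points_3d, yaw, pitch, roll):
        # Virtually cancel the head rotation: P @ R applies R.T (the inverse
        # rotation) to each row vector, mapping landmarks to a frontal frame.
        R = rotation_matrix(yaw, pitch, roll)
        return points_3d @ R

    def gaze_from_iris(iris_frontal, eye_center_frontal, reference_offset):
        # Compare the frontalized iris position with a reference topology:
        # the residual 2D offset indicates the gaze direction.
        offset = iris_frontal[:2] - eye_center_frontal[:2]
        return offset - reference_offset

    # Hypothetical usage with made-up head pose and eye landmarks.
    head = dict(yaw=np.deg2rad(20.0), pitch=np.deg2rad(-5.0), roll=0.0)
    iris = np.array([0.31, 0.12, 0.95])
    eye_center = np.array([0.30, 0.10, 0.95])
    iris_f, center_f = frontalize(np.stack([iris, eye_center]), **head)
    print(gaze_from_iris(iris_f, center_f, reference_offset=np.zeros(2)))

Under these assumptions, the printed offset is the iris displacement that remains after head rotation has been compensated, which is the quantity the abstract describes as being compared against reference topologies.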
