Abstract

The emotional reactions that products evoke (like or dislike) play a key role in user acceptance of and preference for these products, and they influence choice behavior. Facial expressions are a primary channel for expressing and communicating emotions. In particular, the facial reactions evoked by a stimulus are generated subconsciously and can easily be captured by video recording. However, quantifying and objectively describing facial expressions is challenging. To date, most implicit measurements based on facial expression analysis have been formulated in terms of “basic emotions”, which cannot directly represent like/dislike. How can the affective state of consumers be identified directly using facial expression analysis technology? In this study, hedonic rating categories are defined based on the hedonic scale, and automatic facial expression analysis is applied to evaluate the suitability of the defined categories. Moreover, a visual multimodal facial database (RGB, depth, and infrared images) for the hedonic recognition of Asian individuals is constructed. A hybrid deep learning framework is employed to exploit the spatiotemporal information in the collected data for hedonic recognition. Early and late fusion strategies are used to investigate the suitability of the individual modalities and their complementarity. The experimental results indicate that late fusion of the three modalities yields higher recognition accuracy than either the individual modalities or the early fusion scheme. These results demonstrate the effectiveness of fusing different visual facial data and indicate that multimodality-based facial expression analysis can enhance the accuracy of hedonic recognition.
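The abstract names a hybrid spatiotemporal framework with late fusion but does not detail it. The following is a minimal sketch, assuming a CNN+LSTM stream per modality and score-level fusion by averaging per-stream softmax outputs; the layer sizes, the three-class output, and the mean-of-scores fusion rule are illustrative assumptions, not the authors' reported architecture.

```python
# Hypothetical sketch of CNN+LSTM streams with score-level late fusion over
# RGB, depth, and infrared clips; sizes and fusion rule are assumptions.
import torch
import torch.nn as nn

class ModalityStream(nn.Module):
    """CNN frame encoder followed by an LSTM over the frame sequence."""
    def __init__(self, in_channels: int, num_classes: int, hidden: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(                          # per-frame spatial features
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),         # -> (batch*frames, 64)
        )
        self.lstm = nn.LSTM(64, hidden, batch_first=True)  # temporal modeling
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, frames, channels, height, width)
        b, t, c, h, w = clip.shape
        feats = self.cnn(clip.reshape(b * t, c, h, w)).reshape(b, t, -1)
        _, (h_n, _) = self.lstm(feats)                     # final hidden state
        return self.head(h_n[-1])                          # per-modality logits

class LateFusionModel(nn.Module):
    """Independent stream per modality; predictions fused at the score level."""
    def __init__(self, num_classes: int = 3):
        super().__init__()
        # RGB has 3 channels; depth and infrared are treated as 1-channel maps.
        self.rgb = ModalityStream(3, num_classes)
        self.depth = ModalityStream(1, num_classes)
        self.ir = ModalityStream(1, num_classes)

    def forward(self, rgb, depth, ir) -> torch.Tensor:
        probs = [stream(x).softmax(dim=-1) for stream, x in
                 ((self.rgb, rgb), (self.depth, depth), (self.ir, ir))]
        return torch.stack(probs).mean(dim=0)              # averaged class scores

# Toy usage: 2 clips of 8 frames at 64x64 per modality.
model = LateFusionModel(num_classes=3)
rgb = torch.randn(2, 8, 3, 64, 64)
depth = torch.randn(2, 8, 1, 64, 64)
ir = torch.randn(2, 8, 1, 64, 64)
print(model(rgb, depth, ir).shape)  # torch.Size([2, 3])
```

An early-fusion variant would instead concatenate the modalities channel-wise (3 + 1 + 1 = 5 input channels) into a single stream before any learning; the abstract reports that the late fusion scheme achieved the higher accuracy.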
