Multilayer Architectures for Facial Action Unit Recognition.

Tingfan Wu, N. J. Butko, J. Whitehill, J. R. Movellan, M. S. Bartlett, P. Ruvolo

doi:10.1109/tsmcb.2012.2195170

Abstract

In expression recognition and many other computer vision applications, recognition performance is greatly improved by adding a layer of nonlinear texture filters between the raw input pixels and the classifier. The function of this layer is typically known as feature extraction. Popular filter types for this layer are Gabor energy filters (GEFs) and local binary patterns (LBPs). Recent work [1] suggests that adding a second layer of nonlinear filters on top of the first layer may be beneficial. However, it is unclear which architecture of layers and which selection of filters work best. In this paper, we present a thorough empirical analysis of the performance of single-layer and dual-layer texture-based approaches for action unit recognition. For the single-hidden-layer case, GEFs perform consistently better than LBPs, which may be due to their robustness to jitter and illumination noise as well as their ability to encode texture at multiple resolutions. For the dual-layer case, we confirm that the benefit of adding this second layer, while small, is reliable and consistent across data sets. Interestingly, for this second layer, LBPs appear to perform better than GEFs.
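
To make the two filter types named in the abstract concrete, the following is a minimal NumPy sketch of a single Gabor energy filter and a basic 8-neighbor LBP operator. It is an illustration only and not the authors' implementation: the function names, the kernel size, wavelength, and bandwidth defaults, the circular FFT convolution, and the LBP bit ordering and `>=` comparison are all assumptions chosen for brevity.

```python
import numpy as np


def gabor_energy(image, wavelength=8.0, theta=0.0, sigma=4.0, size=15):
    """Magnitude of the response to a complex Gabor kernel.

    The real and imaginary parts of the kernel form an even/odd
    quadrature pair, so the magnitude equals sqrt(even**2 + odd**2).
    All parameter defaults are illustrative assumptions.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Coordinate along the direction in which the carrier oscillates.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    kernel = envelope * np.exp(2j * np.pi * xr / wavelength)
    # Circular convolution via the FFT. The kernel is placed at the
    # top-left corner, so the output is shifted by `half` pixels.
    padded = np.zeros(image.shape, dtype=complex)
    padded[:size, :size] = kernel
    response = np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded))
    return np.abs(response)


def lbp_codes(image):
    """Basic 8-neighbor LBP code for every interior pixel.

    Each neighbor contributes one bit, set when neighbor >= center.
    The clockwise bit ordering used here is an assumption.
    """
    h, w = image.shape
    center = image[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


# A vertical-stripe grating whose period matches the default wavelength.
cols = np.arange(32, dtype=float)
grating = np.tile(np.cos(2.0 * np.pi * cols / 8.0), (32, 1))
energy_matched = gabor_energy(grating, theta=0.0)
energy_orthogonal = gabor_energy(grating, theta=np.pi / 2.0)
flat_codes = lbp_codes(np.ones((5, 5)))
```

In this sketch the grating should produce much higher energy for the filter tuned to its orientation than for the orthogonal one, and a constant image yields the all-ones LBP code at every interior pixel. A dual-layer variant of the kind the abstract describes could, under the same assumptions, apply `lbp_codes` to an energy map returned by `gabor_energy`.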

Full Text