Abstract—Human faces have been broadly studied in digital image and video processing fields. An appearance-based method, the adaptive boosting learning algorithm using integral image representations, has been successfully employed for face detection, taking advantage of the low computational complexity of its feature extraction. In this paper, we propose a face-detection postprocessing method that equalizes instantaneous facial regions in an efficient hardware architecture for use in real-time multimedia applications. The proposed system requires few hardware resources and exhibits robust performance in terms of the movements, zooming, and classification of faces. A series of experimental results obtained using video sequences collected under dynamic conditions is discussed.

Index Terms—Face detection, adaptive boosting algorithm, face-region stabilization, face recognition, single-port line memory

I. INTRODUCTION

Human faces have been broadly studied in digital image and video processing fields. As one of the most detailed attributes in the human-computer interface, the face conveys a great deal of information, including behavioral cues, emotional state, identity, race, and age. Face-based algorithms have thus been used in a wide range of multimedia applications [1-3]. In particular, most digital imaging systems have integrated face-based auto-focusing functions to provide users with a convenient interface from the perspective of visual attention. In addition, face-recognition systems require a correct face-detection scheme to extract the facial features that are fed into pre-trained classifiers [4]. Detection and feature extraction are performed concurrently while searching for human faces in digital images under dynamic visual deformation conditions such as position, scale, in-plane rotation, orientation, pose, and illumination.
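The low feature-extraction cost attributed to integral image representations can be illustrated with a short sketch. The Python code below is only an illustration under stated assumptions, not the hardware architecture proposed in this paper: the function names `integral_image` and `box_sum` are hypothetical, and the image is a plain list of lists. It shows that once the integral image is built, the sum of any rectangular region takes four table lookups regardless of the region size.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of img[r][c] for all r < y and c < x.
    (Hypothetical helper for illustration only.)
    """
    height = len(img)
    width = len(img[0]) if height else 0
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0  # running sum of the current row
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the rectangle starting at (top, left), using four lookups."""
    bottom = top + height
    right = left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Small usage example on a 3x3 image.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
whole = box_sum(table, 0, 0, 3, 3)         # entire image
lower_right = box_sum(table, 1, 1, 2, 2)   # 5 + 6 + 8 + 9
```

Because each rectangle sum costs a constant number of operations, rectangle-based features can be evaluated at any position or size without rescanning the pixels, which is the property that makes this representation attractive for memory- and time-constrained hardware.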
Depending upon the specific application, robust face segmentation has been employed to solve several problems, including the non-rigid shape of faces, clever alignment, and occlusion. To achieve face detection successfully on small mobile multimedia platforms, the hardware architecture must be computationally optimized in terms of memory usage and operation time. An appearance-based method, the adaptive boosting (AdaBoost) learning algorithm, has been widely accepted in face detection using integral image representations. It is well suited to real-time hardware systems such as digital cameras and mobile phones [5] and requires low computational complexity for feature extraction. AdaBoost employs local binary pattern (LBP) images to avoid the effects of illumination and to express detailed textures. A pyramid representation is used to handle the various sizes of human faces. To segment face regions according to the in-plane rotations of the human face, pre-defined feature factors are used. Although AdaBoost exhibits acceptable performance, instantaneous face detection in the spatial domain caused by cascaded structures limits its performance. For example, the pyramid representation is used to compare