Abstract

Low-cost time-of-flight depth sensors such as the Kinect have opened new avenues for video surveillance applications. RGB-D images obtained from such sensors have been shown to improve face recognition performance. Existing RGB-D face recognition methods generally fuse depth information with RGB information, which enhances recognition accuracy. However, in real-world surveillance scenarios, cameras are placed at distances too large for low-cost depth sensors to capture good-quality depth information, and such poor-quality depth may not contribute significantly to face recognition. In this paper, we present a novel representation learning algorithm that learns a shared representation of RGB and depth information using a reconstruction-based deep neural network. Once trained offline, the proposed network can generate the shared representation of RGB and depth data from the RGB image alone. This feature-rich representation is then utilized for face identification, which allows the framework to be used in scenarios where a low-quality depth image, or no depth image at all, is captured. Experiments on multiple real-world databases demonstrate the effectiveness of the proposed approach.
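To make the idea concrete, the sketch below shows one way such a reconstruction-based shared-representation network could be set up. This is an illustrative assumption, not the authors' implementation: the architecture, layer sizes, image resolution, and loss weighting are all hypothetical. The key property it demonstrates is that depth supervises training through a reconstruction loss, while at test time only the RGB encoder is needed to produce the shared feature.

```python
# Minimal sketch (assumed architecture, not the paper's exact network) of a
# reconstruction-based network that learns a shared RGB-D representation
# while requiring only RGB input at test time.
import torch
import torch.nn as nn

class SharedRGBDNet(nn.Module):
    def __init__(self, latent_dim=256):
        super().__init__()
        # Encoder: maps a 64x64 RGB image to the shared latent representation.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, latent_dim),
        )
        # Two decoders reconstruct RGB and depth from the same latent code,
        # forcing the code to carry information about both modalities.
        def decoder(out_channels):
            return nn.Sequential(
                nn.Linear(latent_dim, 64 * 16 * 16),
                nn.Unflatten(1, (64, 16, 16)),
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, out_channels, 4, stride=2, padding=1),
            )
        self.rgb_decoder = decoder(3)
        self.depth_decoder = decoder(1)

    def forward(self, rgb):
        z = self.encoder(rgb)  # shared representation
        return z, self.rgb_decoder(z), self.depth_decoder(z)

# Offline training: depth images supervise the reconstruction loss but are
# never required at deployment time.
model = SharedRGBDNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
rgb = torch.randn(8, 3, 64, 64)    # dummy RGB batch
depth = torch.randn(8, 1, 64, 64)  # dummy aligned depth batch
z, rgb_hat, depth_hat = model(rgb)
loss = (nn.functional.mse_loss(rgb_hat, rgb)
        + nn.functional.mse_loss(depth_hat, depth))
loss.backward()
opt.step()

# Deployment: only an RGB probe image is available; the latent code z serves
# as the feature used for face identification (e.g., with a nearest-neighbour
# or softmax classifier).
with torch.no_grad():
    z_probe, _, _ = model(torch.randn(1, 3, 64, 64))
```

Because the depth decoder only appears in the training loss, the same encoder can be deployed unchanged against surveillance footage where depth is unreliable or absent, which is the scenario motivating the paper.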
