Abstract

Still-to-video (S2V) face recognition (FR) has recently attracted attention from researchers because of its many applications in real-world scenarios. In S2V FR, still images are usually of high quality and are captured from cooperative users in controlled environments (mugshots, for example), whereas video clips may be of low resolution and low quality, acquired from non-cooperative users in uncontrolled environments. Because of these significant differences, we treat S2V FR as a heterogeneous matching problem and propose an approach that builds multiple “bridges” between the two heterogeneous face modalities. Considering the unbalanced distributions of the two modalities and the large differences between them, we propose to exploit a Grassmann manifold learning method that constructs intermediate subspaces to find connections (or transitions) between the still images and the video clips. Multiple geodesic flows are generated, connecting the subspace of the still images to the clustered subspace centers of the videos; these flows are representative and robust in characterizing the relationship between still images and video frames. Extensive experiments are conducted on two large-scale benchmark databases, COX-S2V and PaSC, on two recognition tasks: face identification and face verification. The experimental results show that the proposed approach outperforms state-of-the-art methods under the same experimental settings.
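
To make the idea of a geodesic flow between two subspaces concrete, the following is a minimal sketch of the textbook construction of a Grassmann geodesic from principal angles. It is not the authors' implementation: the abstract does not give their algorithm, so the function names `pca_basis` and `geodesic_point`, the random stand-in features, and the dimensions are all illustrative assumptions. In the proposed approach, one such flow would presumably be built from the still-image subspace to each clustered video subspace center.

```python
import numpy as np

def pca_basis(X, d):
    """Orthonormal basis (D x d) of the top-d principal subspace of the rows of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T

def geodesic_point(A, B, t, eps=1e-12):
    """Orthonormal basis of the subspace at position t in [0, 1] on the
    Grassmann geodesic from span(A) (t = 0) to span(B) (t = 1).
    A and B are D x d matrices with orthonormal columns."""
    # Principal angles: A^T B = U diag(cos(theta)) V^T
    U, c, Vt = np.linalg.svd(A.T @ B)
    theta = np.arccos(np.clip(c, -1.0, 1.0))
    AU = A @ U
    # Part of B's aligned basis orthogonal to span(A); column i has norm sin(theta_i)
    R = B @ Vt.T - AU * c
    s = np.sin(theta)
    safe = np.where(s > eps, s, 1.0)
    Q = np.where(s > eps, R / safe, 0.0)  # zero direction when an angle is ~0
    return AU * np.cos(t * theta) + Q * np.sin(t * theta)

# Hypothetical stand-in features (assumption): 50 samples, 20 dimensions each.
rng = np.random.default_rng(0)
still_feats = rng.normal(size=(50, 20))
video_feats = rng.normal(size=(50, 20)) + 0.5

A = pca_basis(still_feats, d=3)   # subspace of still images
B = pca_basis(video_feats, d=3)   # one video subspace (e.g. a cluster center)

# Sample intermediate subspaces along the flow and project a feature onto each.
ts = np.linspace(0.0, 1.0, 5)
flow = [geodesic_point(A, B, t) for t in ts]
x = still_feats[0]
bridged = np.concatenate([Phi.T @ x for Phi in flow])  # length 5 * 3
```

Projecting both a still-image feature and a video-frame feature onto the same sampled subspaces gives representations that can be compared directly. Sampling a finite number of points is a simplification chosen here for readability; other ways of using the flow are possible.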
