Abstract

This paper presents a cross-age face recognition model that combines Convolutional Neural Networks (CNNs) with Transformers. The model first uses a depthwise-separable T2T-ViT network to extract rich facial features, then applies a multi-scale attention decomposition module to nonlinearly decouple age and identity features. The decomposition is jointly constrained by mutual information minimization, a cross-entropy loss, and the ArcFace loss. The model achieves accuracies of 94.97%, 99.51%, and 95.81% on three benchmark datasets, FG-NET, CACD_VS, and CALFW, respectively, matching or surpassing state-of-the-art (SOTA) performance. These results indicate that the proposed model extracts robust facial information and decouples age and identity features efficiently.
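
As an illustration of the identity constraint named above, the following is a minimal NumPy sketch of an ArcFace-style additive angular margin followed by cross-entropy. It is not the authors' implementation: the function names `arcface_logits` and `cross_entropy` are hypothetical, the `scale` and `margin` values are the commonly used ArcFace defaults rather than settings taken from this paper, and the sketch omits both the mutual information term and the handling of angles that exceed π after the margin is added.

```python
import numpy as np


def arcface_logits(embeddings, weights, labels, scale=64.0, margin=0.5):
    """ArcFace-style logits: scale * cos(theta + margin) for the true class.

    embeddings: (batch, dim) identity features (assumed input of this sketch).
    weights:    (num_classes, dim) class-centre vectors.
    labels:     (batch,) integer class indices.
    scale, margin: assumed hyperparameters, not values reported in the paper.
    """
    # L2-normalise features and class centres so the dot product is a cosine.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(emb @ w.T, -1.0, 1.0)

    # Add the angular margin only to the angle of the ground-truth class.
    theta = np.arccos(cos)
    rows = np.arange(len(labels))
    theta[rows, labels] += margin
    return scale * np.cos(theta)


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, computed in a numerically stable way."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 8))      # toy identity features
    centres = rng.normal(size=(3, 8))    # toy class centres
    ids = np.array([0, 2, 1, 0])
    print(cross_entropy(arcface_logits(feats, centres, ids), ids))
```

The margin lowers the true-class logit, so the loss keeps pushing identity features toward their class centre even after they are correctly classified. In the model described here this term would be combined with the age cross-entropy and the mutual information penalty; the weighting between them is not stated in the abstract.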
