Abstract

Sketch face recognition has a wide range of applications in criminal investigation, but it remains challenging because training samples are scarce and because cross-modality differences between sketches and photos cause semantic deficiencies. The authors propose a light semantic Transformer network to extract and model the semantic information of cross-modality images. First, the authors employ a meta-learning training strategy to obtain task-related training samples, which addresses the small-sample problem. Second, to resolve the conflict between the high complexity of the Transformer and the small sample sizes available for sketch face recognition, the authors build the light semantic Transformer network by proposing a hierarchical group linear transformation and introducing parameter sharing, so that the network can extract highly discriminative semantic features from small-scale datasets. Finally, the authors propose a domain-adaptive focal loss to reduce the cross-modality differences between sketches and photos and to improve the training of the light semantic Transformer network. Extensive experiments show that the features extracted by the proposed method are highly discriminative. The authors' method improves the recognition rate by 7.6% on the UoM-SGFSv2 dataset and reaches a recognition rate of 92.59% on the CUFSF dataset.
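
The abstract does not define the hierarchical group linear transformation, so the following is only a minimal sketch of the general idea behind grouped linear layers, not the authors' method. It assumes a plain, single-level grouped linear map: the input vector is split into equal chunks and each chunk gets its own small weight matrix, which divides the number of weights by the number of groups. The names `group_linear`, `param_count`, and `make_weights` are hypothetical and chosen for illustration.

```python
import random


def group_linear(x, weights):
    """Grouped linear map (illustrative, bias omitted).

    x       : list of floats, length d_in
    weights : list with one matrix per group; each matrix is a list of rows
              of length d_in // groups
    """
    groups = len(weights)
    if len(x) % groups != 0:
        raise ValueError("input size must be divisible by the number of groups")
    chunk = len(x) // groups
    out = []
    for g, w in enumerate(weights):
        # Each group only sees its own slice of the input.
        part = x[g * chunk:(g + 1) * chunk]
        out.extend(sum(wi * xi for wi, xi in zip(row, part)) for row in w)
    return out


def param_count(d_in, d_out, groups):
    """Number of weights in a grouped linear layer (bias omitted)."""
    return groups * (d_in // groups) * (d_out // groups)


def make_weights(d_in, d_out, groups, seed=0):
    """Random weights with shape (groups, d_out // groups, d_in // groups)."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1.0, 1.0) for _ in range(d_in // groups)]
             for _ in range(d_out // groups)]
            for _ in range(groups)]


if __name__ == "__main__":
    # A dense 512x512 layer versus the same layer split into 4 groups.
    print(param_count(512, 512, 1), param_count(512, 512, 4))
    y = group_linear([1.0] * 8, make_weights(8, 4, 2))
    print(len(y))
```

With `groups = 1` the layer is an ordinary dense linear map; with more groups the weight count shrinks proportionally, which is the kind of saving that matters when the training set is small. How the authors arrange groups hierarchically and where they share parameters is not stated in the abstract and is not modeled here.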
