Abstract

Gender identity is one of the most fundamental aspects of life. Automatic gender identification is increasingly used in areas such as security, marketing, and social robotics. This paper addresses the challenges of gender and age identification in crowded, noisy environments where faces are unclear and/or people move in relatively random directions. It presents an end-to-end, real-time intelligent video analytics solution for instant people counting and gender and age estimation in crowded, open environments. The proposed solution includes a complete pipeline for training deep learning vision models and deploying them to edge devices connected to a distributed streaming analytics server. Our final deep learning architecture is an extended version of FairMOT, a multi-object tracking model, with two additional layers: one for multi-class gender classification and one for age regression. Training is performed on an enhanced and enriched version of CrowdHuman, a public dataset for human detection, to which gender and age annotations have been added. The overall system has been validated on various movies and has shown state-of-the-art performance in people tracking and in gender and age inference. Our code, models, and data can be found at https://github.com/jasseur2017/people_gender_age.
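As a rough illustration of how per-detection outputs from a tracker with gender and age heads could be turned into a people count with one gender and age estimate per person, here is a minimal Python sketch. It is not the authors' implementation: the function `summarize_tracks`, the input tuple layout, the label set in `GENDER_CLASSES`, and the choice to average predictions over a track are all assumptions made for this example.

```python
import math
from collections import defaultdict
from statistics import mean

# Assumed label set; the abstract only says the gender head is multi-class.
GENDER_CLASSES = ("female", "male", "unknown")


def softmax(logits):
    """Convert raw classification scores into probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize_tracks(detections):
    """Aggregate per-frame predictions into one estimate per tracked person.

    `detections` is an iterable of (track_id, gender_logits, age) tuples,
    one per detected person per frame (a hypothetical format).
    """
    probs_by_track = defaultdict(list)
    ages_by_track = defaultdict(list)
    for track_id, gender_logits, age in detections:
        probs_by_track[track_id].append(softmax(gender_logits))
        ages_by_track[track_id].append(age)

    summary = {}
    for track_id, probs in probs_by_track.items():
        # Average class probabilities over all frames of the track.
        avg = [mean(p[i] for p in probs) for i in range(len(GENDER_CLASSES))]
        best = max(range(len(avg)), key=avg.__getitem__)
        summary[track_id] = {
            "gender": GENDER_CLASSES[best],
            "age": mean(ages_by_track[track_id]),
        }
    return summary


# Two frames of person 1 and one frame of person 2 (made-up numbers).
detections = [
    (1, [2.0, 0.1, 0.0], 30.0),
    (1, [1.5, 0.2, 0.0], 34.0),
    (2, [0.0, 3.0, 0.1], 52.0),
]
summary = summarize_tracks(detections)
people_count = len(summary)  # one entry per distinct track ID
```

Averaging over a track is one simple way to smooth noisy single-frame predictions when faces are unclear; other aggregation rules, such as majority voting or confidence weighting, would fit the same structure.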
