Abstract
The 2015 FRVT gender classification (GC) report evidences the problems that current approaches tackle in situations with large variations in pose, illumination, background and facial expression. The report suggests that both commercial and research solutions are hardly able to reach an accuracy over 90 % for The Images of Groups dataset, a proven scenario exhibiting unrestricted or in the wild conditions. In this paper, we focus on this challenging dataset, stepping forward in GC performance by observing: 1) recent literature results combining multiple local descriptors, and 2) the psychophysics evidences of the greater importance of the ocular and mouth areas to solve this task. We therefore make use of holistic and inner facial patches to extract features, that are later combined via a score level fusion strategy. The achieved results support the main information provided by the ocular and the mouth areas. Indeed, the combination of multiscale extracted features increases the overall accuracy to over 94 %, reducing notoriously the classification error if compared with tuned holistic and deep learning approaches.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.