Abstract
Hand detection is a crucial pre-processing procedure for many human hand related computer vision tasks, such as hand pose estimation, hand gesture recognition, human activity analysis, and so on. However, reliably detecting multiple hands from cluttering scenes remains to be a challenging task because of complex appearance diversities of dexterous human hands (e.g., different hand shapes, skin colors, illuminations, orientations, and scales, etc.) in color images. To tackle this problem, an accurate hand detection method is proposed to reliably detect multiple hands from a single color image using a hybrid detection/reconstruction convolutional neural networks (CNN) framework, in which regions of hands are detected and appearances of hands are reconstructed in parallel by sharing features extracted from a region proposal layer, and the proposed model is trained in an end-to-end manner. Furthermore, it is observed that the generative adversarial network (GAN) could further boost the detection performance by generating more realistic hand appearances. The experimental results show that the proposed approach outperforms the state-of-the-art on public challenging hand detection benchmarks.
Highlights
The human hand plays a very important role in communication when people interact with each other and with the environment in everyday life
Detecting human hand reliably [1] from single color images or videos which are captured from common image sensors plays an important role in many computer vision-related applications, such as human–computer interaction [2,3], human hand pose estimation [4,5,6], human gesture recognition [7,8], human activity analysis [9], and so on
In this paper we study the hand detection in unconstrained cluttering environment which is a challenging task because of complex appearance diversities of human hands
Summary
The human hand plays a very important role in communication when people interact with each other and with the environment in everyday life. Detecting human hand reliably [1] from single color images or videos which are captured from common image sensors plays an important role in many computer vision-related applications, such as human–computer interaction [2,3], human hand pose estimation [4,5,6], human gesture recognition [7,8], human activity analysis [9], and so on. Hand pose estimation and action recognition were the most challenging steps (or bottlenecks) in the pipeline even in the constrained environment (normally only single hand and simple background in an image) from which the hand can be detected or assumed to be already cropped. Nowadays hand pose estimation and gesture/action recognition in the constrained environments are reaching the mature level, Sensors 2020, 20, 192; doi:10.3390/s20010192 www.mdpi.com/journal/sensors
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.