Abstract

Hand gesture recognition on depth videos is a promising approach for automotive interfaces because it is less sensitive to lighting variation and more accurate than traditional methods. However, recognizing gestures in video remains challenging because factors unrelated to the gesture introduce considerable interference. Because more relevant input frames yield more accurate results, the proposed method proceeds in three steps. First, ResNeXt, a compact and efficient neural network, is used as the feature extractor. Second, an improved weighted frame unification method is adopted to obtain the key frame samples. Finally, after a Feature embedding branch (FEB) is applied to the static data, Discriminant correlation analysis (DCA) is employed to fuse the features of the static and dynamic data. The public dataset named Depth based gesture recognition database (DGRD) is used in this paper; however, the dataset is small and its class distribution is highly imbalanced. We find that the performance of ResNeXt degrades severely under class imbalance, although it achieves excellent results when training data are sufficient. To overcome the limitations of the small dataset, a loss function scheme that combines the softmax loss and the dice loss is proposed. A comparison with other state-of-the-art methods indicates that the proposed method is more practical for gesture recognition and may be widely adopted in automotive interfaces.
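
As an illustration of how a softmax loss and a dice loss might be combined for an imbalanced classification problem, the following is a minimal sketch in plain Python. The abstract does not give the exact formulation, so the weighting factor `alpha`, the smoothing term `eps`, the per-class averaging of the dice term, and all function names are assumptions made for this example and are not the authors' scheme.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Softmax (cross-entropy) loss for one sample with an integer label."""
    return -math.log(max(probs[label], 1e-12))


def dice_loss(batch_probs, labels, num_classes, eps=1.0):
    """Soft dice loss averaged over classes (assumed formulation).

    Each class contributes equally regardless of how many samples it has,
    which is why dice-style terms are often used for imbalanced data.
    """
    loss = 0.0
    for c in range(num_classes):
        # Overlap between predicted probability and one-hot target for class c.
        inter = sum(p[c] for p, y in zip(batch_probs, labels) if y == c)
        # Sum of predicted probabilities plus number of true samples of class c.
        denom = sum(p[c] for p in batch_probs) + sum(1 for y in labels if y == c)
        loss += 1.0 - (2.0 * inter + eps) / (denom + eps)
    return loss / num_classes


def combined_loss(batch_logits, labels, alpha=0.5):
    """Weighted sum of mean cross-entropy and dice loss.

    alpha is a hypothetical mixing weight: 1.0 gives pure cross-entropy,
    0.0 gives pure dice loss.
    """
    num_classes = len(batch_logits[0])
    batch_probs = [softmax(z) for z in batch_logits]
    ce = sum(cross_entropy(p, y) for p, y in zip(batch_probs, labels)) / len(labels)
    dice = dice_loss(batch_probs, labels, num_classes)
    return alpha * ce + (1.0 - alpha) * dice
```

In this sketch, confident correct predictions drive both terms toward zero, while the dice term keeps rare classes from being swamped by frequent ones because every class is weighted equally in the average.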
