Abstract
Hand gesture segmentation is an essential first step in classifying hand gestures, which provide a simple, intuitive, concise and natural means of human–computer and human–robot interaction. However, segmenting hand gestures with varied hand shapes against cluttered backgrounds remains a challenging problem. To address it, we propose a Multi-Branch Cascade Transformer Network (MBCT-Net), built on an encoder-decoder convolutional neural network, to segment hand regions from cluttered backgrounds. The encoder of MBCT-Net consists of a deep convolutional neural network (DCNN) module and a multi-branch cascade Transformer (MBCT) module. The MBCT module is designed to represent both the local details and the global semantic information of hand gestures. To expand the receptive field of MBCT-Net, we design a multi-window self-attention (MWSA) block in each branch of the MBCT module to extract features of hand gestures. The MWSA block not only reduces the amount of computation but also enhances semantic interaction between different windows. Experiments were conducted to verify the proposed MBCT-Net, and the results confirm its effectiveness.

Keywords: Hand gesture segmentation; Deep learning; Transformer
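The abstract does not give the internals of the MWSA block. As a rough illustration of the general idea it names (self-attention restricted to local windows, computed at several window sizes), a minimal NumPy sketch is shown here. Everything specific in it is an assumption rather than the authors' design: the identity query/key/value projections, the window sizes, the averaging used to fuse the branches, and all function names.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def window_self_attention(feat, window):
    """Self-attention inside non-overlapping window x window patches.

    feat: array of shape (H, W, C); H and W must be divisible by `window`.
    Assumption: Q, K and V are the features themselves (no learned weights).
    """
    H, W, C = feat.shape
    assert H % window == 0 and W % window == 0
    nh, nw = H // window, W // window

    # Partition into windows: (num_windows, tokens_per_window, C).
    x = feat.reshape(nh, window, nw, window, C)
    x = x.transpose(0, 2, 1, 3, 4).reshape(nh * nw, window * window, C)

    # Scaled dot-product attention, computed independently per window.
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(C)
    out = softmax(scores, axis=-1) @ x

    # Undo the partition to recover the (H, W, C) layout.
    out = out.reshape(nh, nw, window, window, C)
    return out.transpose(0, 2, 1, 3, 4).reshape(H, W, C)


def multi_window_attention(feat, windows=(2, 4, 8)):
    # Hypothetical fusion: average the outputs of several window sizes.
    return np.mean([window_self_attention(feat, w) for w in windows], axis=0)


def attention_pairs(H, W, window):
    # Number of query-key pairs scored by windowed attention.
    return H * W * window * window
```

In this sketch, windowed attention scores `H * W * window**2` query-key pairs, whereas global attention scores `(H * W)**2`, which is the usual source of the computational saving. Mixing several window sizes is one simple way to let information flow across window boundaries.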