Abstract

In-air handwriting based on a monocular camera is a promising modality for human–computer interaction with many potential applications. However, existing monocular in-air handwriting systems face a significant challenge: determining the spatial coordinates of the fingertip from two-dimensional images is difficult, because the fingertip is very small and has few discriminative features. To tackle this challenge, we propose a Multi-Scale Channel Attention Network (MSCAN). By weighting multi-scale channels, MSCAN helps the target detection model concentrate on high-resolution, small-scale channels, thereby improving fingertip localization precision. We integrate MSCAN with the YOLOv5s model to realize a novel in-air handwriting system based on monocular vision. We conducted comparative experiments on a self-constructed fingertip dataset and several publicly available small-target datasets. Experimental results show that the proposed method improves the accuracy of fingertip and small-target detection. The fingertip detection rate reaches 98%, indicating that the proposed in-air handwriting system enables users to write freely and smoothly.
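
The idea of weighting channels drawn from several scales can be illustrated with a minimal sketch. This is not the MSCAN architecture, which the abstract does not specify. It is a generic channel-attention toy under stated assumptions: each channel is summarized by global average pooling, and a softmax over those summaries stands in for the learned weighting a real network would use. The function names `global_avg_pool`, `channel_attention_weights`, and `reweight` are hypothetical.

```python
import math


def global_avg_pool(channel):
    """Return the mean value of one 2D feature map (a list of rows)."""
    total = sum(sum(row) for row in channel)
    count = sum(len(row) for row in channel)
    return total / count


def channel_attention_weights(scales):
    """Compute one weight per channel across all scales.

    `scales` is a list of feature maps, one per scale; each feature map
    is a list of 2D channels. ASSUMPTION: a softmax over pooled values
    replaces the learned layers a real attention module would contain.
    """
    descriptors = [global_avg_pool(ch) for fmap in scales for ch in fmap]
    peak = max(descriptors)  # subtract the max for numerical stability
    exps = [math.exp(d - peak) for d in descriptors]
    norm = sum(exps)
    return [e / norm for e in exps]


def reweight(scales):
    """Scale every channel by its attention weight; return (output, weights)."""
    weights = channel_attention_weights(scales)
    out = []
    idx = 0
    for fmap in scales:
        new_fmap = []
        for ch in fmap:
            w = weights[idx]
            idx += 1
            new_fmap.append([[w * v for v in row] for row in ch])
        out.append(new_fmap)
    return out, weights


# Toy input: a 2x2 high-resolution channel and a 1x1 low-resolution channel.
toy_scales = [[[[1.0, 1.0], [1.0, 1.0]]], [[[0.0]]]]
toy_out, toy_weights = reweight(toy_scales)
```

In this toy input the high-resolution channel has the larger pooled response, so it receives the larger weight. In a trained network the weights would come from learned parameters rather than directly from the pooled values.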
