Abstract

Human-Robot Interaction (HRI) is a demanding research field that integrates artificial intelligence, informatics, robotics, engineering, and human-machine interaction. There is currently an increased focus on natural user interfaces, with particular attention to gestural modalities. In modern robotic systems, the real-time integration and interpretation of manual gestures plays a vital role in facilitating effective manipulation. Hand gesture recognition is a crucial task in machine vision, one greatly complicated by varying lighting conditions and backgrounds. In this paper, an Enhanced Anchor-free Network (EAF-Net) is first proposed to perform hand gesture recognition in real time. EAF-Net is a deep, single-stage Convolutional Neural Network (CNN)-based architecture. A Custom Precise Prediction Function (CPPF) is utilized for continuous recognition of a specific gesture from a video feed, and the predicted gesture is then used to manipulate a 6-axis collaborative robot. The EAF-Net model utilizes an Enriched Hourglass as its backbone feature-extraction network. The proposed EAF-Net model is trained and assessed on the MITI HD-II dataset, and its performance is also analyzed on the standard benchmark datasets NUSHP-II and Senz-3D. The evaluation of the EAF-Net model spans IoU values from 0.5 to 0.95. On the MITI HD-II dataset, EAF-Net achieved a precision (AP0.5) of 99.22%, a recall (AR0.5) of 98.20%, and an F1-Score0.5 of 98.64%. The model's prediction time was 259 ms on the NVIDIA Jetson Nano processor and 14 ms on the NVIDIA Titan X GPU. This enabled a portable, high-speed 6-axis robot control framework for pick-and-place operations in 3D space. In the near future, the robot should be able to imitate human hand activities such as hand rotation, hand tilting, and picking and arranging items.
Further reductions in the model's prediction time on portable processors (NVIDIA Jetson Nano, Raspberry Pi) could enhance the speed of the human-machine interaction process.
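For context on the detection metrics quoted above, the sketch below shows one common way IoU-thresholded precision, recall, and F1 are computed for a detector at a fixed threshold such as 0.5. This is a minimal illustration assuming axis-aligned boxes and greedy one-to-one matching; it is not the paper's evaluation code, and all box coordinates and function names are hypothetical.

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def metrics_at_iou(preds, gts, thr=0.5):
    """Precision, recall, and F1 with greedy one-to-one matching of
    predicted boxes to ground-truth boxes at an IoU threshold (illustrative)."""
    matched = set()
    tp = 0
    for p in preds:
        best, best_iou = None, thr
        for i, g in enumerate(gts):
            if i not in matched and iou(p, g) >= best_iou:
                best, best_iou = i, iou(p, g)
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Averaging such metrics over thresholds from 0.5 to 0.95 (in steps of 0.05) yields the stricter summary scores reported over that IoU range.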
