Abstract
AbstractGrasping detection, which involves identifying and assessing the grasp ability of objects by robotic systems, has garnered significant attention in recent years due to its pivotal role in the development of robotic systems and automated assembly processes. Despite notable advancements in this field, current methods often grapple with both practical and theoretical challenges that hinder their real‐world applicability. These challenges encompass low detection accuracy, the burden of oversized model parameters, and the inherent complexity of real‐world scenarios. In response to these multifaceted challenges, a novel lightweight grasping detection model that not only addresses the technical aspects but also delves into the underlying theoretical complexities is introduced. The proposed model incorporates attention mechanisms and residual modules to tackle the theoretical challenges posed by varying object shapes, sizes, materials, and environmental conditions. To enhance its performance in the face of these theoretical complexities, the proposed model employs a Convolutional Block Attention Module (CBAM) to extract features from RGB and depth channels, recognising the multifaceted nature of object properties. Subsequently, a feature fusion module effectively combines these diverse features, providing a solution to the theoretical challenge of information integration. The model then processes the fused features through five residual blocks, followed by another CBAM attention module, culminating in the generation of three distinct images representing capture quality, grasping angle, and grasping width. These images collectively yield the final grasp detection results, addressing the theoretical complexities inherent in this task. The proposed model's rigorous training and evaluation on the Cornell Grasp dataset demonstrate remarkable detection accuracy rates of 98.44% on the Image‐wise split and 96.88% on the Object‐wise split. The experimental results strongly corroborate the exceptional performance of the proposed model, underscoring its ability to overcome the theoretical challenges associated with grasping detection. The integration of the residual module ensures rapid training, while the attention module facilitates precise feature extraction, ultimately striking an effective balance between detection time and accuracy.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.