Deep learning computer vision for robotic disassembly and servicing applications

Daniel P Brogan, Nicholas M Difilippo, Musa K Jouaneh

doi:10.1016/j.array.2021.100094

Open Access

Abstract

Fastener detection is a necessary step for computer vision (CV) based robotic disassembly and servicing applications. Deep learning (DL) provides a robust approach for creating CV models capable of generalizing to diverse visual environments. Such DL CV systems rely on tuning the input resolution and mini-batch size to fit the needs of the detection application. This paper provides a method for finding the compromise between input resolution and mini-batch size that yields the highest performance for cross-recessed screw (CRS) detection while using the maximum available graphics processing unit (GPU) resources. The Tiny-You Only Look Once v2 (Tiny-YOLO v2) DL object detection system was chosen to evaluate this method and was employed to solve the specialized task of detecting CRS, which are highly common in electronic devices. The method used in this paper for CRS detection is meant to lay the groundwork for multi-class fastener detection, as it does not depend on the type or number of object classes. An original dataset of 900 images at 12.3 MPx resolution was manually collected and annotated for training. Three additional distinct datasets of 90 images each were manually collected and annotated for testing. An input resolution of 1664 × 1664 pixels paired with a mini-batch size of 16 yielded the highest average precision (AP) among the seven models tested on all three testing datasets. This model scored an AP of 92.60% on the first testing dataset, 99.20% on the second testing dataset, and 98.39% on the third testing dataset.
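
The trade-off between input resolution and mini-batch size can be illustrated with a rough sketch. This is not the authors' procedure. It assumes that GPU memory use grows roughly in proportion to the number of input pixels per mini-batch, that the network downsamples its input by a total factor of 32 (as is usual for Tiny-YOLO v2), and that the reported 1664 × 1664, mini-batch 16 configuration sits at the memory limit. The function names, the candidate batch sizes, and the other resolutions in the usage note are hypothetical.

```python
# Illustrative sketch only: a crude "pixels per mini-batch" budget, not the
# memory model or selection method used in the paper.

STRIDE = 32  # assumed total downsampling factor of Tiny-YOLO v2


def grid_size(resolution: int, stride: int = STRIDE) -> int:
    """Return the side length of the output grid for a square input."""
    if resolution % stride != 0:
        raise ValueError(f"resolution must be a multiple of {stride}")
    return resolution // stride


def max_batch(resolution: int, pixel_budget: int,
              candidates=(1, 2, 4, 8, 16, 32, 64)):
    """Largest candidate mini-batch whose total pixel count fits the budget.

    Returns None when even the smallest candidate does not fit.
    """
    fitting = [b for b in candidates
               if b * resolution * resolution <= pixel_budget]
    return max(fitting) if fitting else None


# Assumption: the configuration reported in the abstract defines the budget.
PIXEL_BUDGET = 16 * 1664 * 1664


def candidate_pairs(resolutions, pixel_budget=PIXEL_BUDGET):
    """Map each resolution to the largest mini-batch that fits the budget."""
    return {r: max_batch(r, pixel_budget) for r in resolutions}
```

Calling `candidate_pairs([832, 1664, 3328])` pairs each hypothetical resolution with the largest mini-batch that fits under this simplified budget. Under these assumptions, halving the side length allows roughly four times the mini-batch size, which is the compromise the method has to resolve by measuring AP for each pair.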

Full Text