Abstract

We study the problem of 3D pose estimation of textureless shiny objects from monocular 2D images, for a bin-picking task. The main challenge of dealing with a shiny object comes from the fact that the object's appearance changes substantially with its pose and illumination. Therefore, conventional 3D-2D correspondence search usually fails due to the inconsistency of feature descriptors. For a textureless object like a mechanical part, visual feature matching becomes even harder due to the absence of stable texture features. Hierarchical template matching approaches require a larger number of templates to be matched when dealing with shiny objects, due to the drastic appearance changes with pose. In the challenging scenario of a bin-picking task, we must also cope with partial occlusions, shadows and inter-reflections, requiring redoubled effort in matching each template to obtain reliable results. This reduces the attractiveness of such approaches, which are otherwise popular for textureless objects. In this thesis, we develop a purely data-driven method to tackle the pose estimation problem. Motivated by photometric stereo, we develop an imaging system with multiple lights to acquire a multi-light image whose channels are obtained by varying the illumination direction. In an offline stage, we capture multi-light images of a given object in several poses. Then, we use random ferns to cluster the appearance of small patches of the multi-light images, and we store in each cluster the information of possible object poses. At run-time, the patches of the input multi-light image use the cluster information to probabilistically vote on several pose hypotheses. Since our pose hypotheses are a discrete set, we refine the discretized pose in the continuous space, in order to obtain accurate object poses for robotic manipulation. Experiments show that the proposed method can detect and estimate poses of textureless and shiny objects accurately and robustly within half a second. We further compare our approach with the HALCON commercial software, a highly optimized hierarchical template matching approach developed by MVTec, and show some of the drawbacks of this type of approach. Finally, we run detection on a different object by simply changing the image database.
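
The fern-based clustering and voting steps can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the implementation from the thesis. The class name `FernPoseVoter`, the pairwise-comparison binary tests, the additive vote, the patch size, and the random stand-in patches are all hypothetical choices made for illustration. A multi-light patch is assumed to be an array of shape (height, width, number of lights).

```python
import numpy as np


class FernPoseVoter:
    """Toy random-fern voter: each fern hashes a patch to a leaf holding pose counts."""

    def __init__(self, n_ferns, n_tests, patch_shape, n_poses, seed=0):
        rng = np.random.default_rng(seed)
        n_values = int(np.prod(patch_shape))  # patch_shape = (height, width, n_lights)
        # Assumption: each binary test compares two random entries of the flattened patch.
        self.idx_a = rng.integers(0, n_values, size=(n_ferns, n_tests))
        self.idx_b = rng.integers(0, n_values, size=(n_ferns, n_tests))
        self.weights = 1 << np.arange(n_tests)  # turns the test bits into a leaf index
        self.ferns = np.arange(n_ferns)
        # counts[fern, leaf, pose] starts at 1 so that no leaf has an empty histogram.
        self.counts = np.ones((n_ferns, 2 ** n_tests, n_poses))

    def leaves(self, patch):
        """Return the leaf (cluster) index of the patch in every fern."""
        flat = np.asarray(patch, dtype=float).ravel()
        bits = flat[self.idx_a] > flat[self.idx_b]      # shape (n_ferns, n_tests)
        return bits.astype(np.int64) @ self.weights     # shape (n_ferns,)

    def train(self, patch, pose_id):
        """Offline stage: record that this patch appearance occurred at pose_id."""
        self.counts[self.ferns, self.leaves(patch), pose_id] += 1

    def vote(self, patches):
        """Run-time stage: every patch adds its per-fern pose distribution to the votes."""
        votes = np.zeros(self.counts.shape[2])
        for patch in patches:
            hist = self.counts[self.ferns, self.leaves(patch)]   # (n_ferns, n_poses)
            votes += (hist / hist.sum(axis=1, keepdims=True)).sum(axis=0)
        return votes


# Synthetic stand-ins: one random 5x5 patch with 4 light channels per discrete pose.
rng = np.random.default_rng(1)
patch_shape = (5, 5, 4)
templates = rng.random((3, *patch_shape))
voter = FernPoseVoter(n_ferns=20, n_tests=6, patch_shape=patch_shape, n_poses=3)

for pose_id, template in enumerate(templates):
    for _ in range(30):
        voter.train(template + rng.normal(0, 0.01, patch_shape), pose_id)

query = [templates[2] + rng.normal(0, 0.01, patch_shape) for _ in range(10)]
best_pose = int(np.argmax(voter.vote(query)))
```

Here `best_pose` is only the highest-scoring discrete hypothesis. The continuous refinement mentioned in the abstract would follow as a separate step and is not sketched.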

