Abstract

This paper presents a context-based semantic indoor scene recognition method for an autonomous mobile robot. The proposed method comprises context-based feature representation and unsupervised-learning-based scene recognition. We use Gist as the background feature representation and the scale-invariant feature transform (SIFT) computed in the hue, saturation, and value (HSV) color space as the foreground feature representation. For recognition, the proposed method enables visualization of categorical boundaries and their relations based on unsupervised learning frameworks. We used the KTH-IDOL2 benchmark datasets to evaluate our method in comparison with a method using SIFT alone. The mean recognition accuracies of HSV-SIFT and SIFT were, respectively, 57.6% and 44.7% for five categories, evaluated using leave-one-out cross-validation. Comparison of the two results shows that the accuracy of HSV-SIFT is 12.9 percentage points (28.9% relative) higher than that of SIFT. In an analysis of category maps using the unified distance matrix (U-Matrix), the categorical boundaries of HSV-SIFT were extracted more clearly than those of SIFT. Moreover, we demonstrate that HSV-SIFT yields the same number of clusters as the ground truth (GT), without fragmentation. Furthermore, we demonstrate that representative images are distributed over the respective clusters according to their appearance features.
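The leave-one-out evaluation protocol mentioned above can be sketched as follows. This is a minimal illustration only: the 1-nearest-neighbour classifier and the toy 2-D features are assumptions for demonstration, not the paper's actual HSV-SIFT/Gist pipeline.

```python
import numpy as np

def loocv_accuracy(features, labels):
    """Leave-one-out cross-validation with a 1-nearest-neighbour
    classifier: each sample is classified by the label of its closest
    remaining sample in feature space."""
    n = len(features)
    correct = 0
    for i in range(n):
        dists = np.linalg.norm(features - features[i], axis=1)
        dists[i] = np.inf  # exclude the held-out sample itself
        nearest = int(np.argmin(dists))
        correct += int(labels[nearest] == labels[i])
    return correct / n

# Toy data: two well-separated "scene" clusters in a 2-D feature space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)),
               rng.normal(1.0, 0.1, (10, 2))])
y = np.array([0] * 10 + [1] * 10)
print(loocv_accuracy(X, y))  # well-separated clusters -> 1.0
```

In the paper's setting, each image's feature vector would take the place of the toy 2-D points, and the reported mean accuracy is the average of such per-sample decisions over the dataset.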
