Abstract

Object recognition is fundamental to high-level computer vision tasks such as image segmentation, object tracking, and behavior analysis. Its main objective is to determine whether a specified object exists in a given image, so extracting representative features from images and training a suitable classifier are the key techniques in this area. In this paper, we use a convolutional neural network (CNN) model to learn features from the RGB-D dataset; these features are then passed to a linear SVM classifier to classify objects. Because the RGB-D dataset does not contain enough images to train a deep neural network from scratch with high feature extraction accuracy, we fine-tune a Caffe model that was pre-trained on approximately 1.2 million RGB images from the ImageNet database. Although a depth image is intrinsically different from an RGB image, we transform each depth image into three channels and extract features with the same method used for RGB images. We achieve a classification accuracy of 91.35%, a substantial improvement over the previous state of the art.
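The abstract does not state how a depth image is mapped to three channels, so the following is only a minimal sketch of one simple possibility, not necessarily the method used in the paper. It assumes min-max scaling of the valid depth readings to the range 0 to 255 and replication of the result across three channels, so that an RGB-pretrained network can accept the input. The function name `depth_to_three_channels` and the `missing` parameter are hypothetical, and plain Python lists are used only to keep the sketch dependency-free.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def depth_to_three_channels(
    depth: List[List[float]], missing: float = 0.0
) -> List[List[Pixel]]:
    """Scale a single-channel depth map to 0-255 and copy it into three channels.

    Assumption: pixels equal to `missing` carry no depth reading and map to 0.
    Note that the smallest valid depth also maps to 0 under min-max scaling.
    """
    valid = [d for row in depth for d in row if d != missing]
    if not valid:
        # No usable readings: return an all-black image of the same shape.
        return [[(0, 0, 0) for _ in row] for row in depth]

    lo, hi = min(valid), max(valid)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat depth map

    def scale(d: float) -> int:
        if d == missing:
            return 0
        return int(round(255 * (d - lo) / span))

    # Replicate the scaled value into three identical channels per pixel.
    return [[(scale(d),) * 3 for d in row] for row in depth]


if __name__ == "__main__":
    toy_depth = [[0.0, 1.0], [1.0, 3.0]]  # 0.0 marks a missing reading
    for row in depth_to_three_channels(toy_depth):
        print(row)
```

Other encodings (for example, colorizing depth with a color map) would fit the same description in the abstract; replication is simply the easiest to illustrate.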
