Abstract

The traditional approach to object recognition first extracts image representations and then feeds them to a learning model, such as an SVM, to learn the classification decision boundary. These representations are handcrafted and heavily engineered: the object image is run through a sequence of pipeline processes that require good prior knowledge of the problem domain. In end-to-end deep learning models, by contrast, the image representations and the classification decision boundary are both learnt directly from the raw image pixels, requiring no prior knowledge of the problem domain. Moreover, the deep model features are more discriminative than handcrafted ones, since the model is trained to discriminate between features belonging to different classes. The purpose of this study is sixfold: (1) review the literature on the pipeline processes used in the previous state-of-the-art codebook model approach to generic object recognition, (2) introduce several enhancements to the local feature extraction and normalization processes of the recognition pipeline, (3) compare the proposed enhancements with different encoding methods and contrast them with previous results, (4) experiment with current state-of-the-art deep model architectures used for object recognition, (5) compare deep representations extracted from the deep learning model with shallow representations handcrafted by an expert and produced through the recognition pipeline, and finally, (6) improve the results further by combining multiple different deep learning models into an ensemble and taking the maximum posterior probability.
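The ensemble step in item (6) can be illustrated with a minimal sketch. It assumes one plausible reading of "taking the maximum posterior probability": for each class, keep the highest posterior that any model in the ensemble assigns to it, then predict the class with the largest such value. The function name `max_posterior_ensemble` and the example probabilities are hypothetical and are not taken from the study.

```python
from typing import Sequence


def max_posterior_ensemble(posteriors: Sequence[Sequence[float]]) -> int:
    """Return the class index chosen by a max-posterior ensemble.

    posteriors[m][c] is the posterior probability that model m assigns
    to class c for a single image (assumed layout, for illustration).
    """
    if not posteriors:
        raise ValueError("at least one model output is required")
    n_classes = len(posteriors[0])
    if any(len(p) != n_classes for p in posteriors):
        raise ValueError("all models must score the same number of classes")

    # For each class, keep the highest posterior any model assigned to it.
    best_per_class = [max(p[c] for p in posteriors) for c in range(n_classes)]

    # Predict the class whose best posterior is largest overall.
    return max(range(n_classes), key=lambda c: best_per_class[c])


# Hypothetical softmax outputs of three models for one image, three classes.
model_a = [0.50, 0.30, 0.20]
model_b = [0.10, 0.85, 0.05]
model_c = [0.40, 0.35, 0.25]

prediction = max_posterior_ensemble([model_a, model_b, model_c])
print(prediction)
```

Under this rule a single very confident model can override the others, which differs from averaging the posteriors across models. Which variant the study uses is not stated in the abstract.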

