Abstract
We introduce a non-parametric hierarchical Bayesian approach for open-ended 3D object categorization, named the Local Hierarchical Dirichlet Process (Local-HDP). This method allows an agent to learn independent topics for each category incrementally and to adapt to the environment in time. Each topic is a distribution of the visual words over a predefined dictionary. Using an inference algorithm, these latent variables are inferred from the dataset. Subsequently, the category of an object is determined based on the likelihood of generating a 3D object from the model. Hierarchical Bayesian approaches like Latent Dirichlet Allocation (LDA) can transform low-level features to high-level conceptual topics for 3D object categorization. However, the efficiency and accuracy of LDA-based approaches depend on the number of topics that is chosen manually. Moreover, fixing the number of topics for all categories can lead to overfitting or underfitting of the model. In contrast, the proposed Local-HDP can autonomously determine the number of topics for each category. Furthermore, the online variational inference method has been adapted for fast posterior approximation in the Local-HDP model. Experiments show that the proposed Local-HDP method outperforms other state-of-the-art approaches in terms of accuracy, scalability, and memory efficiency by a large margin. Moreover, two robotic experiments have been conducted to show the applicability of the proposed approach in real-time applications.
Highlights
Most recent object recognition/detection techniques are based on deep neural networks [1,2,3,4,5,6,7]
In the offline evaluations, the model is trained once with a training set and evaluated using a testing set from the dataset
We propose a non-parametric hierarchical Bayesian model called Local Hierarchical Dirichlet Process (Local-HDP) for interactive open-ended 3D object category learning and recognition
Summary
Most recent object recognition/detection techniques are based on deep neural networks [1,2,3,4,5,6,7]. These methods typically need a large labeled dataset for a long training process. The number of object categories (class labels) should be predefined in advance for such methods. Open-ended learning means that the number of categories (class labels) is not fixed and predefined for the model, and that it can grow during runtime. Object category recognition is not a well-defined problem because of the
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have