Abstract

With advances in machine vision systems (e.g., artificial eyes, unmanned aerial vehicles, surveillance monitoring), scene semantic recognition (SSR) technology has attracted much attention because of its applications in autonomous driving, tourist navigation, intelligent traffic and remote aerial sensing. Although tremendous progress has been made in visual interpretation, several challenges remain (e.g., dynamic backgrounds, occlusion, lack of labeled data, and changes in illumination, direction, and size). We therefore propose a novel SSR framework that segments the locations of objects, generates a novel Bag of Features, and recognizes scenes via Maximum Entropy. First, denoising and smoothing are applied to the scene data. Second, a modified Fuzzy C-Means is integrated with super-pixels and Random Forest to segment objects. Third, the segmented objects are used to extract a novel Bag of Features that concatenates different blobs, multiple orientations, Fourier transform and geometrical points over the objects. Fourth, an Artificial Neural Network recognizes the multiple objects using the different patterns of objects. Finally, scene labels are estimated via a Maximum Entropy model. In experimental evaluation, the proposed system achieved mean accuracy rates of 90.07% on the MSRC dataset and 89.26% on the Caltech 101 dataset for object recognition, and 93.53% on the Pascal-VOC12 dataset for scene recognition. The proposed system should be applicable to various emerging technologies, such as augmented reality to represent the real-world environment for military training and engineering design, as well as entertainment, artificial eyes for visually impaired people, and traffic monitoring to avoid congestion or road accidents.
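To make the segmentation step more concrete, the following is a minimal sketch of the standard Fuzzy C-Means update on one-dimensional values such as pixel intensities. It is not the modified variant described in the abstract and leaves out the super-pixel and Random Forest stages. The function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made for illustration only.

```python
import random


def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Standard fuzzy C-means on 1-D values (illustrative, not the paper's variant)."""
    rng = random.Random(seed)
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in values:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([r / total for r in row])
    centers = [0.0] * n_clusters
    for _ in range(n_iter):
        # Center update: mean of the values weighted by membership ** m.
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(values))]
            centers[k] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        # Membership update: closer centers receive higher membership.
        for i, x in enumerate(values):
            dists = [abs(x - c) + 1e-12 for c in centers]
            for k in range(n_clusters):
                denom = sum((dists[k] / d) ** (2.0 / (m - 1.0)) for d in dists)
                u[i][k] = 1.0 / denom
    return centers, u


# Toy "image": three dark and three bright pixel intensities (made-up data).
pixels = [0.10, 0.12, 0.09, 0.90, 0.88, 0.93]
centers, memberships = fuzzy_c_means(pixels)
# Hard labels are obtained by taking the cluster with the largest membership.
labels = [row.index(max(row)) for row in memberships]
```

Unlike k-means, each pixel keeps a graded membership in every cluster, which is what allows a later stage to refine ambiguous boundary pixels.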

Highlights

  • Visual sensor [1] technology is one of the most significant human sensing tools that attains content awareness from the surroundings by exploiting the statistical dependencies of detected objects [2]

  • We propose a robust method for multiple-object and scene recognition based on an Artificial Neural Network and a Maximum Entropy model, which can successfully predict the semantic labels of objects in different scenes

  • We propose a novel and effective five-step framework that robustly segments the locations of objects, generates a new Bag of Features, and recognizes complex scenes

Summary

INTRODUCTION

Visual sensor [1] technology is one of the most significant human sensing tools that attains content awareness from the surroundings by exploiting the statistical dependencies of detected objects [2]. We propose a novel SSR model that integrates modified Fuzzy C-Means [16] and Random Forest to segment single/multiple objects [17]. Different features, such as the discrete Fourier transform, blob extraction, multiple orientations and geometrical shape, are merged to develop a Bag of Features. An Artificial Neural Network [18] recognizes single/multiple objects based on the extracted features. The recognized objects are then used to estimate the posterior of the class label with the Maximum Entropy [19] method for scene classification.
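The last step, estimating the posterior of a scene label from recognized objects, can be illustrated with a small conditional Maximum Entropy model, p(y | x) = exp(Σ_j w_{y,j} f_j(x)) / Z(x). This is a sketch under stated assumptions rather than the paper's implementation: the function `maxent_posterior`, the label and object names, and the hand-set weights are all hypothetical, and in practice the weights would be learned from training data.

```python
import math


def maxent_posterior(features, weights):
    """Return p(label | features) under a conditional Maximum Entropy model."""
    # Linear score per label: sum of weight * feature value.
    scores = {
        label: sum(w.get(name, 0.0) * value for name, value in features.items())
        for label, w in weights.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exp_scores = {label: math.exp(s - top) for label, s in scores.items()}
    z = sum(exp_scores.values())  # partition function Z(x)
    return {label: e / z for label, e in exp_scores.items()}


# Hypothetical weights linking recognized objects to scene labels.
WEIGHTS = {
    "street": {"car": 1.5, "building": 1.0, "cow": -1.0},
    "farm": {"cow": 2.0, "grass": 1.0, "car": -0.5},
}

# Objects recognized in one image, encoded as indicator features.
posterior = maxent_posterior({"car": 1.0, "building": 1.0}, WEIGHTS)
scene = max(posterior, key=posterior.get)
```

Here the scene is chosen as the label with the highest posterior, so objects that co-occur strongly with a scene raise its probability while contradictory objects lower it.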

RELATED WORK
MULTIPLE OBJECT SEGMENTATION
EXPERIMENTAL SETUP AND RESULTS
DISCUSSIONS
Findings
CONCLUSION