Abstract

This work presents a technique for recognizing indoor home scenes from the objects detected in them. Object detection is performed by a pre-trained Mask R-CNN (Region-based Convolutional Neural Network), and scene recognition is performed by a separate Convolutional Neural Network (CNN). The output of the Mask R-CNN is fed as input to the CNN, giving the CNN information about the objects detected in a scene, so that the CNN recognizes the scene from the combination of objects detected. The CNN is trained on the object detection outputs of the Mask R-CNN, which lets it learn the various combinations of objects that a scene can contain. Training used 500 combinations generated by the Mask R-CNN from five indoor home scenes (bathroom, bedroom, kitchen, living room, and dining room). The trained network was tested on 24,000 indoor home scene images, and the final accuracy of the CNN is 97.14%.
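The abstract does not say how the detector output is encoded before it reaches the CNN. As a minimal sketch of one plausible encoding, the snippet below turns a list of detected class names into a fixed-length count vector. The vocabulary, the function name `encode_detections`, and the count-vector representation are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Hypothetical object vocabulary; the abstract does not list the classes used.
VOCABULARY = [
    "bed", "toilet", "sink", "oven", "refrigerator",
    "couch", "tv", "dining table", "chair",
]


def encode_detections(labels, vocabulary=VOCABULARY):
    """Turn detected class names into a fixed-length count vector.

    Assumed representation: one slot per vocabulary entry, holding how many
    times that object was detected. Labels outside the vocabulary are ignored.
    """
    counts = Counter(labels)
    return [counts.get(name, 0) for name in vocabulary]


# Example: class names as a detector might report them for a bedroom image.
detections = ["bed", "chair", "tv", "chair"]
features = encode_detections(detections)
print(features)  # [1, 0, 0, 0, 0, 0, 1, 0, 2]
```

A vector like this could serve as one training example for a scene classifier, with the scene name (for example, bedroom) as its label.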
