Abstract
Object detection is one of the most critical tasks in the field of Computer vision. This task comprises identifying and localizing an object in the image. Architectural floor plans represent the layout of buildings and apartments. The floor plans consist of walls, windows, stairs, and other furniture objects. While recognizing floor plan objects is straightforward for humans, automatically processing floor plans and recognizing objects is challenging. In this work, we investigate the performance of the recently introduced Cascade Mask R-CNN network to solve object detection in floor plan images. Furthermore, we experimentally establish that deformable convolution works better than conventional convolutions in the proposed framework. Prior datasets for object detection in floor plan images are either publicly unavailable or contain few samples. We introduce SFPI, a novel synthetic floor plan dataset consisting of 10,000 images to address this issue. Our proposed method conveniently exceeds the previous state-of-the-art results on the SESYD dataset with an mAP of 98.1%. Moreover, it sets impressive baseline results on our novel SFPI dataset with an mAP of 99.8%. We believe that introducing the modern dataset enables the researcher to enhance the research in this domain.
Highlights
We present an end-to-end trainable framework that works on Cascade Mask RCNN [15] with conventional and deformable [16] convolutional backbone network to detect various objects in floor plan images
Our backbone ResNeXt-101 [17] is pre-trained on MS-COCO dataset [26]. Using this pre-trained feature extraction backbone helps our architecture to adapt from the domain of natural scenes to floor plan images
We achieve a 0.995 Mean Average Precision (mAP) score and 0.997 Mean Average Recall (mAR) score. This clearly shows that our model performs better on the SFPI dataset where we have sufficient images to train a model as compared to less number of images we have in SESYD [3]
Summary
Architectural floor plans contain both structural and semantic information, e.g., room size, type, location of doors, walls, and furniture [3]
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.