Abstract

Extracting textual information is an important part of automatic floor plan analysis, as it is essential for a thorough understanding of the drawing. This paper presents a text extraction approach that combines a deep learning-based object detection model with state-of-the-art Optical Character Recognition (OCR) methods. The paper contributes to the research community in three ways. First, it adds annotations for text elements to existing data sets. Second, it proposes a specialized data synthesis pipeline that generates training images mimicking important characteristics of real data. Finally, it documents a comparative study of deep learning-based object detection architectures (Tesseract, EAST, CRAFT, Faster R-CNN, YOLOv5, YOLOR, YOLOv7, and YOLOv8) and OCR tools (PARSEq, MATRN, EasyOCR, and Tesseract) for the task. Results indicate that YOLOv7 yields the best text detection performance (up to 97.5% wmAP) and that PARSEq excels in character recognition (85.2% CER). The data sets are made available.
