Abstract

Deep learning text detection generally consists of two steps: candidate box prediction by a deep model, followed by post-processing, which usually applies non-maximum suppression (NMS) or merges and connects the predicted boxes. Many current text detection methods can locate the characters in a document image, but the predicted positions are often not accurate enough, which degrades subsequent recognition. To address this inaccuracy, this article proposes a post-processing method for document detection based on geometric features. The method consists of four modules. First, the background removal (BR) module separates background information from character information using a pixel threshold. Second, the candidate box expansion (CBE) module expands each predicted box in all directions by checking whether the boundary of the box lies on character pixels. Third, the non-standard box removal (NBR) module uses the consistency between a character and its surrounding characters to filter out some erroneous predicted boxes. Finally, the repeating box removal (RBR) module removes duplicate predicted boxes. To verify the effectiveness of the method, extensive experiments were conducted on the Standard yi, Chinese2k, English2k, ICDAR 2015, and ICDAR 2017 (CTW-12k) datasets. The experimental results show that the proposed method can improve text detection performance.
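As a rough illustration of the first two modules, the sketch below thresholds a grayscale image into a character mask (BR) and then grows a predicted box one pixel at a time while any of its edges still lies on character pixels (CBE). It is not the article's implementation: the function names, the threshold value of 128, the assumption of dark characters on a light background, the inclusive `(x0, y0, x1, y1)` box format, and the `max_steps` cap are all assumptions made for this example.

```python
import numpy as np


def remove_background(gray, threshold=128):
    """Return a boolean mask that is True on assumed character pixels.

    Assumption: characters are darker than the background, so any pixel
    below `threshold` counts as a character pixel.
    """
    return gray < threshold


def expand_box(mask, box, max_steps=50):
    """Grow an axis-aligned box until none of its edges touches character pixels.

    `box` is (x0, y0, x1, y1) with inclusive pixel coordinates (an assumed
    convention). Each pass moves every edge that still overlaps character
    pixels outward by one pixel, without leaving the image.
    """
    height, width = mask.shape
    x0, y0, x1, y1 = box
    for _ in range(max_steps):
        changed = False
        if x0 > 0 and mask[y0:y1 + 1, x0].any():           # left edge
            x0 -= 1
            changed = True
        if x1 < width - 1 and mask[y0:y1 + 1, x1].any():   # right edge
            x1 += 1
            changed = True
        if y0 > 0 and mask[y0, x0:x1 + 1].any():           # top edge
            y0 -= 1
            changed = True
        if y1 < height - 1 and mask[y1, x0:x1 + 1].any():  # bottom edge
            y1 += 1
            changed = True
        if not changed:
            break
    return (x0, y0, x1, y1)


# Toy example: a 4x4 dark "character" on a white 10x10 page,
# with a predicted box that covers only its centre.
page = np.full((10, 10), 255, dtype=np.uint8)
page[3:7, 3:7] = 0
char_mask = remove_background(page)
expanded = expand_box(char_mask, (4, 4, 5, 5))
print(expanded)  # (2, 2, 7, 7)
```

In this toy case the box stops one pixel outside the character, at the first position where no edge overlaps character pixels. The NBR and RBR modules are not sketched here because the abstract does not specify their criteria.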
