Abstract

In recent years, as scene text detection and recognition has become increasingly important in real-world applications, research methods have matured and attention has shifted toward improving accuracy. Because natural scenes involve complex backgrounds, varied font arrangements, changing lighting, and similar factors, existing methods still struggle to meet application requirements. This paper studies the detection and recognition of curved text and improves on the method based on the ESIR framework [1] in two respects: (1) The fiducial points of the TPS (thin-plate spline) transform are predicted by a localization network, which uses a CNN to regress the x and y coordinates of the required fiducial points [2]. (2) ACE (aggregated cross entropy) is used for sequence recognition; its loss function is faster to compute and uses less memory than CTC and attention mechanisms, and it performs well in the one-dimensional prediction of scene text recognition. In this paper, fiducial points are sampled on images from irregular text data sets, the TPS parameter matrix T is computed and used to transform the points in the image, and the transformation is applied iteratively until the text in the image is horizontal. A CNN-BLSTM model then performs one-dimensional prediction and outputs the recognition result. The selected data sets are irregular text data sets such as SVT-Perspective and CUTE80.
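
To illustrate the aggregated cross entropy idea in point (2), the following is a minimal sketch rather than the implementation used in this paper. It assumes the commonly described form of the ACE loss: per-step class probabilities are summed over all T time steps and normalized by T, then compared by cross entropy with the normalized character counts of the label, with the blank class taking the remaining T - |label| steps. The function name `ace_loss`, the blank index 0, and the smoothing constant `eps` are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def ace_loss(probs, label, blank=0, eps=1e-10):
    """Aggregated cross entropy for one sequence (illustrative sketch).

    probs: list of T rows, each a list of C class probabilities summing to 1.
    label: list of class indices for the target text (assumed blank-free).
    blank: index of the blank class (assumed to be 0 here).
    """
    num_steps = len(probs)
    num_classes = len(probs[0])
    if len(label) > num_steps:
        raise ValueError("label is longer than the number of time steps")

    # Target: how often each class should appear; blank fills the unused steps.
    counts = Counter(label)
    counts[blank] = num_steps - len(label)

    loss = 0.0
    for k in range(num_classes):
        # Aggregate the predicted probability of class k over all time steps.
        pred_k = sum(row[k] for row in probs) / num_steps
        target_k = counts.get(k, 0) / num_steps
        loss -= target_k * math.log(pred_k + eps)
    return loss


# Four time steps, three classes (0 = blank), target text [1, 2, 1].
one_hot = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.0, 0.0, 1.0]}
matching = [one_hot[1], one_hot[2], one_hot[1], one_hot[0]]
uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in range(4)]

print(ace_loss(matching, [1, 2, 1]))
print(ace_loss(uniform, [1, 2, 1]))
```

Because only per-class totals enter the loss, it needs no alignment search over time steps as in CTC and no decoder as in attention, which is consistent with the speed and memory advantage stated above. The same property means the loss by itself is insensitive to the order of the predictions.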
