Abstract

Stylistic text can be found on sign boards, street and organization boards, logos, bulletin boards, announcements, advertisements, dangerous goods plates, warning notices, etc. In stylistic text images, the text-lines within an image may be curved or may not be parallel to one another. As a result, extracting and subsequently recognizing individual text-lines and words in such images is a difficult task. In this paper, we propose a novel scheme for straightening curved text-lines using dilation, flood-fill, robust thinning, and B-spline curve fitting. In the proposed scheme, dilation is first applied to each text-line to cover the area within a certain boundary. Thinning is then applied to obtain the path of the text, and the path is approximated by a B-spline. Next, the angle between the normal at each point on the curve and the vertical line is computed. Finally, each point of the text is visited and rotated by its corresponding angle. The proposed methodology is tested on a variety of text images containing text-lines in Devanagari, English, and Chinese scripts, and is evaluated on the basis of visual perception and the mean square error (MSE). The MSE is calculated by fitting a line to the text in the input and output images. On the basis of the evaluation results obtained in our experiments, the proposed method is promising.
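The last steps of the scheme (B-spline approximation of the path, the angle between the normal and the vertical, the per-point rotation, and the line-fit MSE) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: it assumes the thinned path has already been reduced to a short list of control points, it uses a uniform cubic B-spline, it maps each text point through its nearest curve sample, and all function names are hypothetical. Dilation, flood-fill, and thinning are omitted.

```python
import math

def bspline_eval(ctrl, i, t):
    """Return (x, y, dx, dy) on uniform cubic B-spline segment i at t in [0, 1]."""
    pts = ctrl[i:i + 4]
    basis = ((1 - t) ** 3 / 6,
             (3 * t ** 3 - 6 * t ** 2 + 4) / 6,
             (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6,
             t ** 3 / 6)
    deriv = (-(1 - t) ** 2 / 2,
             (3 * t ** 2 - 4 * t) / 2,
             (-3 * t ** 2 + 2 * t + 1) / 2,
             t ** 2 / 2)
    x = sum(w * p[0] for w, p in zip(basis, pts))
    y = sum(w * p[1] for w, p in zip(basis, pts))
    dx = sum(w * p[0] for w, p in zip(deriv, pts))
    dy = sum(w * p[1] for w, p in zip(deriv, pts))
    return x, y, dx, dy

def sample_path(ctrl, per_segment=20):
    """Sample the curve as (x, y, theta, s) tuples.

    theta is the tangent's angle to the horizontal, which equals the angle
    between the curve normal and the vertical; s is cumulative arc length.
    """
    segments = len(ctrl) - 3
    samples, s, prev = [], 0.0, None
    for i in range(segments):
        last = i == segments - 1
        # Include t = 1 only on the last segment to avoid duplicate samples.
        for k in range(per_segment + (1 if last else 0)):
            x, y, dx, dy = bspline_eval(ctrl, i, k / per_segment)
            if prev is not None:
                s += math.hypot(x - prev[0], y - prev[1])
            samples.append((x, y, math.atan2(dy, dx), s))
            prev = (x, y)
    return samples

def straighten(points, samples):
    """Rotate each point by -theta about its nearest curve sample (assumed mapping)."""
    out = []
    for px, py in points:
        x, y, theta, s = min(
            samples, key=lambda q: (q[0] - px) ** 2 + (q[1] - py) ** 2)
        ox, oy = px - x, py - y
        u = ox * math.cos(theta) + oy * math.sin(theta)   # along the tangent
        v = -ox * math.sin(theta) + oy * math.cos(theta)  # along the normal
        out.append((s + u, v))  # new x is arc length, new y is offset from path
    return out

def line_fit_mse(points):
    """Mean squared residual of a least-squares line y = a*x + b."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    a = sxy / sxx
    b = my - a * mx
    return sum((p[1] - (a * p[0] + b)) ** 2 for p in points) / n
```

Calling `straighten(text_points, sample_path(control_points))` returns coordinates in which the fitted path lies on the horizontal axis, so `line_fit_mse` on the output should be lower than on the input when the straightening works. A real implementation would also need to resample the rotated pixels onto an image grid, which this sketch does not attempt.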

Highlights

  • A large volume of research effort has been dedicated to OCR systems

  • A number of algorithms [1,2,3,4,5,6] are available for this purpose, and many commercial OCR systems [7,8] are on the market, but most of these systems can recognize only text images with straight text-lines and are designed for a specific script or language

  • The main advantage of the proposed approach is that no specific feature extraction technique, classification technique, or dataset is required to recognize such documents


Summary

Introduction

A large volume of research effort has been dedicated to OCR systems. A number of algorithms [1,2,3,4,5,6] are available for this purpose, and many commercial OCR systems [7,8] are on the market, but most of these systems can recognize only text images with straight (horizontal) text-lines and are designed for a specific script or language. In 2000, Adam et al. [21] proposed an approach for recognizing multi-oriented and multi-scaled characters in engineering drawings, one work on English stylistic text recognition. In 2005, Pal et al. [24] proposed a recognition-based approach to handle Indian multi-oriented and curved text.
