Abstract

Newspapers contain artistic components such as headings, advertisements and stylized posters, which are printed so that text lines are oriented differently and contain multi-font curved text images. WordArt in Microsoft Office is one widely used tool for producing multi-oriented text; it can also stretch, skew, bend or otherwise modify the shape of the text. Such text is not horizontal and is stylized with multiple font properties, which makes it difficult to recognize with conventional optical character recognition (OCR) systems. In this paper, a novel technique is proposed that transforms multi-line, multi-font curved text images into a single-font horizontal format. The method applies threshold techniques to vertical and horizontal projection bars to divide the text through line segmentation and word segmentation. Each character is then individually reshaped about its centroid in order to align it with the vertical axis. This approach yields high recognition accuracy for text strings that OCR fails to recognize before alignment.

Keywords: Vertical projection, Horizontal projection, Segmentation, Centroid, Translation, Rotation
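The segmentation and centroid steps mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a binarized image stored as a list of rows with 1 for ink and 0 for background, and the names `projection`, `runs_above` and `centroid`, as well as the default threshold of 0, are hypothetical choices made for illustration. The rotation that aligns each character with the vertical axis is not shown.

```python
def projection(image, axis):
    """Count ink pixels per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def runs_above(profile, threshold=0):
    """Return (start, end) index pairs, end exclusive, where profile > threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a text run begins
        elif value <= threshold and start is not None:
            runs.append((start, i))        # a gap closes the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def centroid(image):
    """Return the (row, col) centroid of ink pixels, or None if there are none."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return None
    return (row_sum / total, col_sum / total)


# Toy binary image: two text lines, the first holding two separate blocks.
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Line segmentation: threshold the row-wise (horizontal) projection.
line_rows = runs_above(projection(image, axis=0))

# Word/character segmentation: threshold the column-wise (vertical)
# projection inside the first detected line.
top, bottom = line_rows[0]
first_line = image[top:bottom]
word_cols = runs_above(projection(first_line, axis=1))

# Centroid of the first block, in the block's own coordinates.
left, right = word_cols[0]
first_block = [row[left:right] for row in first_line]
print(line_rows, word_cols, centroid(first_block))
```

On real scans, a threshold above zero would probably be needed so that noise pixels do not merge neighbouring lines or words; the appropriate value depends on the image and is not given in the abstract.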
