Abstract

Lanna script is an archaic script that is rarely used today. Few people can still read or write it, so anyone who wants to read an old Lanna manuscript needs help translating it. A character recognition system is therefore needed to convert Lanna script into a commonly used script. The poor condition of the manuscripts and the writing style of the script make this problem very difficult; the hardest cases arising from the writing style are touching and overlapping characters. For this reason, the first two stages of the character recognition process, image preprocessing and segmentation, must be handled carefully for recognition accuracy to be high. This paper proposes two new techniques. The first converts a grayscale image to a binary image by combining the concepts of the multithresholding method and Otsu's method. The second addresses the segmentation of touching characters. Bounding box analysis is first used to segment the document image into images of isolated characters and images of touching characters. A thinning algorithm is then applied to extract the skeleton of the touching characters. Next, using the junction points as separation points, the skeleton is separated into several pieces. Finally, the separated pieces are put back together to reconstruct two isolated characters. The proposed algorithm achieves an accuracy of 86.67%.
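
Two building blocks named in the abstract, Otsu's method and junction points on a skeleton, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how multithresholding and Otsu's method are combined or which thinning algorithm is used, so the code shows plain single-threshold Otsu and assumes the skeleton is already available as a 0/1 grid. The function names `otsu_threshold` and `junction_points` are hypothetical, and the crossing-number test for a junction (three or more 0-to-1 transitions around a pixel's eight neighbours) is one common choice, assumed here for illustration.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    `pixels` is a flat sequence of integer gray levels in 0..255.
    Pixels with value <= t form the dark class, the rest the bright class.
    """
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class
    sum0 = 0  # sum of gray levels in the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break             # bright class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Clockwise offsets of the eight neighbours, starting at north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def junction_points(skel):
    """Return (row, col) of skeleton pixels where three or more branches meet.

    `skel` is a list of rows of 0/1 values, assumed to be one pixel wide.
    A pixel counts as a junction when the circular sequence of its eight
    neighbours contains at least three 0-to-1 transitions (assumed criterion).
    """
    rows, cols = len(skel), len(skel[0])

    def at(r, c):
        # Treat everything outside the image as background.
        return skel[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    found = []
    for r in range(rows):
        for c in range(cols):
            if not skel[r][c]:
                continue
            ring = [at(r + dr, c + dc) for dr, dc in RING]
            crossings = sum(
                1 for i in range(8) if ring[i] == 0 and ring[(i + 1) % 8] == 1
            )
            if crossings >= 3:
                found.append((r, c))
    return found
```

In a pipeline like the one the abstract describes, the threshold would be used to mark dark pixels (`value <= t`) as ink, and the junction points found on a thinned touching-character image would serve as candidate separation points.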
