Abstract

Line segmentation is a useful preliminary step for further text segmentation. Some line segmentation frameworks use a binarization method as an initial step, but binarization remains a major challenge, especially on old palm-leaf manuscripts, because the images contain varying degrees of noise in the non-text regions. The seam carving method, a line segmentation method that uses a binarization-free approach, can be an alternative solution. However, this method can separate text lines incorrectly where small text elements are located at the bottom or the top of a main character contour. Therefore, an improved line segmentation framework is proposed that uses hybrid binarization and applies it to a minimum-energy function to separate the text lines. The proposed framework has been evaluated on 44 old Sundanese manuscript images consisting of true-color and binary images. The evaluation shows that this framework can improve on the Niblack binarization process by up to 50%. In addition, the framework not only generates a number of text lines close to the number of target lines, but also separates the text lines well around small text elements. Overall, the proposed line segmentation framework produces the expected result.
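To illustrate the minimum-energy idea behind seam-based line separation, the following is a minimal sketch of a generic dynamic-programming search for a left-to-right seam. It is not the framework evaluated here: the function name `min_energy_horizontal_seam`, the toy energy map, and the rule that a seam may move at most one row per column are assumptions made for illustration, and the abstract does not specify how the energy is computed from the hybrid binarization.

```python
def min_energy_horizontal_seam(energy):
    """Return, for each column, the row of a minimum-energy left-to-right seam.

    `energy` is a list of rows (hypothetical input): high values mark ink,
    low values mark the gap between text lines.
    """
    h, w = len(energy), len(energy[0])
    # cost[y][x] = cheapest total energy of any seam ending at (y, x)
    cost = [[energy[y][0]] + [0.0] * (w - 1) for y in range(h)]
    back = [[0] * w for _ in range(h)]
    for x in range(1, w):
        for y in range(h):
            # Assumed connectivity: the seam moves at most one row per column.
            candidates = [r for r in (y - 1, y, y + 1) if 0 <= r < h]
            best = min(candidates, key=lambda r: cost[r][x - 1])
            back[y][x] = best
            cost[y][x] = energy[y][x] + cost[best][x - 1]
    # Start from the cheapest end point and follow the back-pointers.
    row = min(range(h), key=lambda r: cost[r][w - 1])
    seam = [row]
    for x in range(w - 1, 0, -1):
        row = back[row][x]
        seam.append(row)
    seam.reverse()
    return seam


# Toy energy map: the low-energy path dips from row 1 to row 2 and back.
toy_energy = [
    [9, 9, 9, 9],
    [1, 9, 9, 1],
    [9, 1, 1, 9],
]
seam_rows = min_energy_horizontal_seam(toy_energy)
```

In a setting like the one described, such a seam would be traced through the gap between two adjacent text lines, so that small text elements above or below a main character contour stay on the correct side of the cut.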
