Abstract

Tamil is an ancient language with a vast body of literature written on palm leaves and other materials. Palm leaf manuscripts served as a versatile medium for recording information on medicine, literature, theatre, and other subjects. Although these manuscripts need to be digitized and transcribed, recognizing their cursive characters remains a challenging task. This study introduces a novel Convolutional Neural Network (CNN) technique that learns the characteristics of palm leaf characters during the training phase so that it can classify them. The input image is first preprocessed with morphological operations to remove noise. Connected component analysis, an image processing technique that identifies and labels the individual connected regions (components) of a binary image, is then used to segment the palm leaf characters; the segmentation handles text line spacing, spacing without an obstacle, and spacing with an obstacle. Finally, the extracted cursive characters are passed to the CNN for classification. Experiments on a collected set of cursive Tamil palm leaf manuscripts compare the proposed CNN with existing deep learning techniques in terms of accuracy, precision, recall, and other metrics.
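
The segmentation step can be illustrated with a minimal sketch of connected component analysis on a tiny binary image. This is not the authors' implementation: the function name `label_components`, the 8-connectivity rule, and the `min_area` filter (used here as a simple stand-in for the morphological noise removal described above) are all assumptions made for illustration.

```python
from collections import deque


def label_components(image, min_area=2):
    """Find 8-connected foreground regions in a binary image.

    `image` is a list of rows containing 0 (background) or 1 (ink).
    Returns inclusive bounding boxes (top, left, bottom, right),
    ordered left to right. Regions smaller than `min_area` pixels
    are treated as noise and dropped (an illustrative assumption).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])


# Toy "manuscript strip": two character-like blobs and one noise pixel.
page = [
    [1, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1, 1],
]
boxes = label_components(page, min_area=2)
print(boxes)
```

Each bounding box marks one candidate character that could be cropped and passed to a classifier. Real manuscripts would also need the line-spacing and obstacle handling mentioned in the abstract, which this sketch does not attempt.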
