Ancient Malayalam manuscripts contain a vast quantity of material, much of which pertains to ancient customs and culture. These manuscripts are the foundation of the rich and diverse culture that we possess today. Extracting information from such scripts, written on “Thaliyola” (palm leaves), is an extensive and tedious task because of challenges such as understanding the ancient script and the damage done to the palm leaves by mishandling, stains, fungus, and other environmental factors. Preserving and segregating these documents through binarization and classification is therefore an important task. Classifying these documents will help in understanding unique ancient documents and in preserving them for current and future generations. In the proposed work, an RGB image of the palm leaf document is first given as input to the Spectral Angle Mapper (SAM) algorithm, which converts it into an RGB-A image with an additional alpha channel. FCC and TCC are then calculated, and a spectral image with good text readability is generated, which is followed by classification. The spectral image is passed to a VGG-16 model along with other binarized images; after training and testing, the model achieved an accuracy of 90% on Jadakam and 85% on Bhagavatham.
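The abstract does not state the exact SAM formulation used in this work. As an illustration only, the standard spectral angle between a pixel spectrum and a reference spectrum is the arccosine of their normalized dot product, and a small angle indicates a similar spectrum. The sketch below assumes RGB triples as spectra; the function names, the reference ink colour, and the threshold are hypothetical and are not taken from the paper.

```python
import math

def spectral_angle(pixel, reference):
    """Return the angle in radians between two spectra (e.g. RGB triples)."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        # Assumption: a zero vector has no direction, so treat it as dissimilar.
        return math.pi / 2
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_theta)

def text_mask(image, reference, threshold=0.1):
    """Mark pixels whose spectral angle to the reference is within the threshold.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    `threshold` is in radians and is an illustrative value, not a tuned one.
    """
    return [[spectral_angle(px, reference) <= threshold for px in row]
            for row in image]

if __name__ == "__main__":
    ink = (40, 30, 20)                        # hypothetical reference ink colour
    image = [[(40, 30, 20), (0, 0, 255)]]     # one ink-like pixel, one unrelated pixel
    print(text_mask(image, ink))
```

Because the angle depends only on the direction of the spectrum and not on its magnitude, a pixel that is a brighter or darker version of the reference colour still yields an angle near zero, which is one reason SAM is often considered robust to uneven illumination.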