Abstract

As a virtual restoration tool, digital imaging is widely used to share the heritage of ancient documents and to preserve their original material. However, the captured raw images usually contain extraneous backgrounds that interfere with subsequent image processing, especially the damage investigation process. In this study, an end-to-end method named PLM-SegNet, based on U-Net, was proposed for palm-leaf manuscript (PLM) segmentation. Two cameras (Nikon and Sony) were used to capture 83 palm-leaf manuscript images. The images were labeled with the Labelme software and then cropped into fixed-size patches to train, validate, and test the PLM-SegNet model. Each patch was fed into PLM-SegNet to obtain its foreground distribution map, and the per-patch maps of an image were stitched together into one global foreground distribution map, from which the PLM can easily be segmented. On two independent test sets, PLM-SegNet achieved pixel accuracies of 99.73% and 98.36%, intersection over union (IoU) of 99.42% and 98.31%, recall of 99.68% and 99.95%, and F1-scores of 99.70% and 99.15%, respectively. Additionally, damage detection was adopted as a case study to demonstrate the significance of PLM-SegNet. Compared with the raw PLM images, damage detection on the segmented PLM images improved by 15.00% in F1-score and 19.33% in IoU. The results show that PLM-SegNet is a precise and automated method for palm-leaf manuscript segmentation even when labeled training data are limited. The source code is available at https://github.com/Ryan21wy/PLM-SegNet.
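
The patch-based inference and stitching workflow described above can be illustrated with a minimal sketch. The code below is not the authors' implementation (see the linked repository for that); the patch size of 256, the reflect padding, the 0.5 threshold, and the `model` callable that maps an (H, W, 3) patch to a per-pixel foreground probability map are all assumptions made for illustration.

```python
import numpy as np

def predict_foreground_map(image, model, patch_size=256):
    """Crop an RGB image into fixed-size patches, predict a foreground
    probability map for each patch, and stitch the predictions back into
    one global foreground map (a sketch of the approach in the abstract)."""
    h, w = image.shape[:2]
    # Pad so both dimensions are multiples of the patch size (assumed padding mode).
    pad_h = (patch_size - h % patch_size) % patch_size
    pad_w = (patch_size - w % patch_size) % patch_size
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="reflect")
    global_map = np.zeros(padded.shape[:2], dtype=np.float32)

    for y in range(0, padded.shape[0], patch_size):
        for x in range(0, padded.shape[1], patch_size):
            patch = padded[y:y + patch_size, x:x + patch_size]
            # `model` is a stand-in for a trained PLM-SegNet-style network
            # returning an (patch_size, patch_size) probability map in [0, 1].
            global_map[y:y + patch_size, x:x + patch_size] = model(patch)

    # Remove the padding and threshold to obtain the binary foreground mask
    # (0.5 threshold is an assumption).
    foreground = global_map[:h, :w] > 0.5
    return foreground

# The manuscript can then be segmented by keeping only foreground pixels:
# segmented = image * foreground[..., None]
```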
