The Dunhuang ancient manuscripts are an excellent and precious cultural heritage of humanity. However, due to their age, the vast majority of these treasures are damaged and fragmented. Faced with a wide range of sources and numerous fragments, the process of restoration generally involves two core elements: sibling fragments identification and fragment assembly. Currently, fragment restoration still heavily relies on manual labor. During the long practice, a consensus has been reached on the importance of edge features for not only assembly but also for identification. However, accurate extraction of edge features and their use for efficient identification requires extensive knowledge and strong memory. This is a challenge for the human brain. So that in previous studies, fragment edge features have been used for assembly validation but rarely for identification. Therefore, an edge matcher is proposed, working like a bloodhound, capable of “sniffing out” specific “flavors” in edge features and performing efficient sibling fragment identification accordingly, providing guidance when experts perform entity assembly subsequently. Firstly, the fragmented images are standardized. Secondly, traditional methods are used to compress the representation of fragment edges and obtain paired local edge images. Finally, these images are fed into the edge matcher for classification discrimination, which is a CNN-based pairwise similarity metric model proposed in this paper, introducing residual blocks and depthwise separable convolutions, and adding multi-scale convolutional layers. With the edge matcher, a complex matching problem is successfully transformed into a simple classification problem. In the absence of a standard public dataset, a Dunhuang manuscript fragment edge dataset is constructed. Experiments are conducted on that dataset, and the accuracy, precision, recall, and F1 scores of the edge matcher all exceeded 97%. The effectiveness of the edge matcher is demonstrated by comparative experiments, and the rationality of the method design is verified by ablation experiments. The method combines traditional methods and deep learning methods to creatively use the edge geometric features of fragments for sibling fragment identification in a natural rather than coded way, making full use of the computer’s computational and memory capabilities. The edge matcher can significantly reduce the time and scope of searching, matching, and inferring fragments, and assist in the reconstruction of Dunhuang ancient manuscript fragments.
Read full abstract