Abstract

We propose a large-scale Hangul font recognizer capable of recognizing 3300 Hangul fonts. Large-scale Hangul font recognition is challenging: Hangul fonts are typically distinguished by small differences in detailed shapes, which recognizers often ignore. Practical applications raise additional issues, such as almost indistinguishable fonts and new fonts released after the recognizer has been trained. Only a few recently developed font recognizers scale to thousands of fonts, and most of them focus on fonts for Western languages. The proposed recognizer, HanFont, comprises a convolutional neural network (CNN) model designed to distinguish detailed shapes effectively and a font clustering algorithm that addresses the issues caused by indistinguishable fonts and untrained new fonts. In the experiments, HanFont exhibited a recognition rate of 94.11% for 3300 Hangul fonts, including numerous similar fonts, which is 2.49% higher than that of ResNet. Its cluster-level recognition accuracy was 99.47% when the 3300 fonts were grouped into 1000 clusters. In a test on 100 new fonts without retraining the CNN model, HanFont exhibited 57.87% accuracy. The average accuracy for the top 56 untrained fonts was 75.76%.
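
The difference between font-level and cluster-level accuracy can be illustrated with a minimal sketch. It assumes a font-to-cluster mapping is already available; the abstract does not describe how HanFont builds its clusters, so the mapping, the font names, and the function names below are hypothetical and are not taken from the paper. A prediction counts as correct at the cluster level when the predicted font falls in the same cluster as the true font, even if the exact font is wrong.

```python
from typing import Dict, Sequence


def font_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of samples whose predicted font equals the true font."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def cluster_accuracy(
    predicted: Sequence[str],
    actual: Sequence[str],
    font_to_cluster: Dict[str, int],
) -> float:
    """Fraction of samples whose predicted font shares a cluster with the true font."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        font_to_cluster[p] == font_to_cluster[a]
        for p, a in zip(predicted, actual)
    )
    return hits / len(actual)


# Hypothetical data: FontA and FontB are near-identical and share cluster 0.
font_to_cluster = {"FontA": 0, "FontB": 0, "FontC": 1}
predicted = ["FontA", "FontB", "FontC", "FontA"]
actual = ["FontA", "FontA", "FontC", "FontC"]

print(font_accuracy(predicted, actual))
print(cluster_accuracy(predicted, actual, font_to_cluster))
```

In this toy example the second sample confuses two fonts from the same cluster, so it is a miss at the font level but a hit at the cluster level. This is why grouping almost indistinguishable fonts can raise the reported accuracy.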
