Abstract

Recognizing handwritten Uchen Tibetan characters is considered an efficient way of acquiring large volumes of data in the digital era. However, the task remains challenging because letters often touch heavily and the same character can take many different shapes. Deeper neural networks are therefore needed to reach decent recognition accuracy, which makes an efficient, lightweight model design important for balancing the inevitable trade-off between accuracy and latency. To reduce the number of learnable parameters as far as possible while maintaining acceptable accuracy, we introduce an efficient model named HUTNet, designed around the relationship between floating-point operations (FLOPs) and memory access cost (MAC). The proposed network achieves a ResNet-18-level accuracy of 96.86% with only a tenth of the parameters. Pruning and knowledge distillation were then applied to further reduce the inference latency of the model. Experiments on the Handwritten Uchen Tibetan Data set by Wang (HUTDW) test set, containing 42,068 samples in 562 classes, show that the compressed model achieves an accuracy of 96.83% while requiring fewer FLOPs and fewer parameters. To verify the effectiveness of HUTNet, we also tested it on the Chinese handwriting data set Handwriting Database 1.1 (HWDB1.1), on which HUTNet achieved an accuracy of 97.24%, higher than that of ResNet-18 and ResNet-34. Overall, we conduct extensive experiments on the trade-off between resources and accuracy and show stronger performance than other well-known models on HUTDW and HWDB1.1. HUTNet also removes a critical bottleneck for handwritten Uchen Tibetan recognition on low-power computing devices.
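The abstract does not spell out which relationship between FLOPs and memory access cost (MAC) HUTNet relies on. As a minimal sketch of one well-known relation of this kind, consider a 1x1 convolution on an h x w feature map: its multiply-accumulate count is h·w·c_in·c_out, while its MAC is roughly h·w·(c_in + c_out) + c_in·c_out. For a fixed FLOP budget, MAC is lowest when the input and output channel counts are equal. The function name `conv1x1_cost` and the layer sizes below are illustrative assumptions, not values from the paper.

```python
def conv1x1_cost(h, w, c_in, c_out):
    """Return (flops, mac) for a 1x1 convolution on an h x w feature map.

    Illustrative cost model only (an assumption, not HUTNet's actual design):
    flops counts multiply-accumulate operations, and mac counts the elements
    read or written for the input map, the output map, and the weights.
    """
    flops = h * w * c_in * c_out
    mac = h * w * (c_in + c_out) + c_in * c_out
    return flops, mac


if __name__ == "__main__":
    # Three channel splits with the same product c_in * c_out,
    # so all three layers have identical FLOPs but different MAC.
    for c_in, c_out in [(128, 128), (64, 256), (32, 512)]:
        flops, mac = conv1x1_cost(28, 28, c_in, c_out)
        print(f"c_in={c_in:4d} c_out={c_out:4d} flops={flops} mac={mac}")
```

Under this model, the balanced layer (128 input and 128 output channels) has the smallest memory traffic of the three, which is why FLOPs alone can be a misleading proxy for latency on low-power devices.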

