An Enhanced Prototypical Network Architecture for Few-Shot Handwritten Urdu Character Recognition

Rajat Sahay, Mickaël Coustaty

doi:10.1109/access.2023.3263721

Abstract

Few-shot models have gained considerable popularity in the past few years, mostly because they can structure the representation space (classes) using very few examples per class. Such models are usually trained on a wide range of classes and their examples, which allows them to learn a decision-based metric in the process. Non-Latin languages, especially Urdu, are written bidirectionally and are context-sensitive, which makes them hard to recognize. Also, unlike English, very little clean, collated, and usable data is available for Urdu. In this paper, we explore a prototypical network for k-shot classification of handwritten Urdu characters. The prototypical network learns Euclidean embeddings of the provided images and uses clusters to classify new examples. Our improved method outperforms other few-shot learning methods and accurately classifies both Urdu characters and numerals using a minimal number of examples. After a comprehensive qualitative and quantitative comparison of our proposed approach with other methods for classifying handwritten text in few-shot settings, we found that our approach typically beat the other methods by a margin of 1% to 2% while relying on a small training set.
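The classification step described in the abstract can be illustrated with a minimal sketch. It assumes the standard prototypical-network formulation, in which each class prototype is the mean of that class's support embeddings and a query is assigned to the nearest prototype under squared Euclidean distance. The embedding network itself is omitted, the 2-D vectors stand in for learned image embeddings, and the function names `class_prototypes` and `classify_queries` are hypothetical; the enhanced architecture proposed in the paper may differ in its details.

```python
import numpy as np

def class_prototypes(support_emb, support_labels):
    """Return the class ids and one prototype (mean embedding) per class."""
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, protos

def classify_queries(query_emb, classes, protos):
    """Assign each query to the nearest prototype (squared Euclidean distance)."""
    # Broadcast to shape (n_query, n_classes, dim), then reduce over dim.
    diffs = query_emb[:, None, :] - protos[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    # Softmax over negative distances gives per-class probabilities.
    logits = -dists
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return classes[dists.argmin(axis=1)], probs

# Toy 2-way, 2-shot episode; these vectors are made-up stand-ins for the
# embeddings a trained network would produce from character images.
support_emb = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 1.2]])
support_labels = np.array([0, 0, 1, 1])
query_emb = np.array([[0.1, 0.1], [0.9, 1.0]])

classes, protos = class_prototypes(support_emb, support_labels)
pred, probs = classify_queries(query_emb, classes, protos)
print(pred)
```

In a k-shot setting, `support_emb` would hold k embeddings per class, so adding a new character class only requires computing one more prototype rather than retraining a classifier head.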

Full Text