Abstract

Developing a handwritten document image dataset is one of the most tedious and time-consuming tasks in experimental work on optical character recognisers (OCR). Special attention needs to be given to feasibility, realism, clarity and similar factors while collecting real-life data from different writers. A few efforts to develop a handwritten numeral image database (NIdb) can be found in the literature, but they were restricted to a single script, namely the local script of the researcher who prepared the database. In this paper, an approach to developing a word-level handwritten NIdb of four popular Indic scripts, namely Bangla, Devanagari, Roman and Urdu, is proposed. A benchmark result is established for the handwritten numeral script identification (HNSI) problem by applying a novel image transform fusion (ITF) based technique. The proposed dataset will be freely available to researchers for non-commercial use.
