A heuristic algorithm for the recognition of printed Chinese characters

C.-T. Chuang, L.Y. Tseng

doi:10.1109/21.370205

Abstract

A heuristic algorithm for the recognition of printed Chinese characters is presented. Preprocessing first identifies the individual straight-line primitive strokes of a character, then records the order in which these strokes are encountered during three scans: two orthogonal and one diagonal. Each scan yields an ordered set of primitive strokes that can be binary encoded, and the three resulting codes are called feature codes. The feature codes serve as hash keys in both the training phase and the recognition phase. In an experiment trained on 13,053 characters of a single font, only six pairs of characters had coincident feature codes. Recognition took 44.4 milliseconds of 80386 CPU time per character (1,350 characters per minute, excluding disk I/O time). The recognition rate ranged from 97.22% to 98.4%.
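The hashing step described above can be illustrated with a minimal sketch. The abstract does not give the stroke alphabet, the bit layout, or the hash function, so everything concrete here is an assumption made for illustration: the four stroke types in `STROKE_BITS`, the 2-bits-per-stroke packing with a leading sentinel bit, the names `encode_scan`, `feature_key`, and `FeatureCodeTable`, and the use of a Python dictionary in place of the authors' hash table. The stroke sequences in the usage example are made up and do not correspond to real characters.

```python
from collections import defaultdict

# Hypothetical stroke alphabet (assumption): the abstract only says
# "straight-line primitive strokes" without listing them.
STROKE_BITS = {
    "H": 0b00,  # horizontal
    "V": 0b01,  # vertical
    "L": 0b10,  # left-falling diagonal
    "R": 0b11,  # right-falling diagonal
}


def encode_scan(strokes):
    """Pack an ordered stroke sequence into an integer, 2 bits per stroke.

    A leading 1 bit acts as a sentinel so that sequences of different
    lengths (for example ["H"] and ["H", "H"]) get different codes.
    """
    code = 1
    for stroke in strokes:
        code = (code << 2) | STROKE_BITS[stroke]
    return code


def feature_key(scan_x, scan_y, scan_diag):
    """Combine the three per-scan codes into one hashable key."""
    return (encode_scan(scan_x), encode_scan(scan_y), encode_scan(scan_diag))


class FeatureCodeTable:
    """Dictionary-backed lookup from feature codes to character labels."""

    def __init__(self):
        self._table = defaultdict(list)

    def train(self, label, scan_x, scan_y, scan_diag):
        # Training phase: store the label under its feature-code key.
        self._table[feature_key(scan_x, scan_y, scan_diag)].append(label)

    def recognize(self, scan_x, scan_y, scan_diag):
        # Recognition phase: one hash lookup; an empty list means "unknown".
        return list(self._table.get(feature_key(scan_x, scan_y, scan_diag), []))

    def collisions(self):
        # Keys shared by more than one label (coincident feature codes).
        return {k: v for k, v in self._table.items() if len(v) > 1}


# Usage with made-up stroke sequences for two placeholder labels.
table = FeatureCodeTable()
table.train("char_a", ["H", "V"], ["V", "H"], ["H", "V"])
table.train("char_b", ["H", "H", "V"], ["V", "H", "H"], ["H", "V", "H"])
print(table.recognize(["H", "V"], ["V", "H"], ["H", "V"]))
```

In this sketch, characters whose three scans produce identical stroke sequences land under the same key, which is the situation the abstract reports for six pairs of characters; `collisions()` lists such keys so they can be resolved by some further test.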
