Abstract

We have developed an application software system for font- and size-invariant optical character recognition (OCR). The system includes a preliminary front-end process that handles gray-scale normalization, noise elimination, line finding, and character block segmentation. Its main characteristic, however, is the combination of a feature extraction approach based on Fourier descriptors (FDs) with a multicategory back-propagation neural network classifier for recognition. Instead of using one set of FDs to represent the image object, as in the conventional FD approach, we use three sets of FDs to represent different portions of the object. We find that this approach solves the intrinsic problem of FDs caused by their rotational and reflectional invariance properties. Our method can therefore correctly classify ambiguous characters such as 5, 2, p, and q. The system is intended to serve as a basic building block for later, more complex OCR systems. In general, back propagation provides gradient descent optimization in training. After successful training, and without prior knowledge of the mathematical relation between input and output, the network can efficiently map a wide variety of input patterns to an arbitrary set of output patterns. Thus, the system is not only capable of recognizing printed English text and isolated cursive script, but can also be extended to read other symbols at the expense of additional training time. At present, a recognition accuracy of 99.8 percent has been achieved for trained English samples in our experiment.

© (1993) COPYRIGHT SPIE--The International Society for Optical Engineering. Downloading of the abstract is permitted for personal use only.
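The ambiguity that the abstract attributes to the invariance properties of FDs can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fourier_descriptors`, the parameter `n_coeffs`, the test shape, and the choice of magnitude-only descriptors computed with NumPy's FFT are all assumptions made for illustration. The sketch shows that a single set of such descriptors is identical for a shape, its rotated copy, and its mirror image, which is why pairs such as p and q cannot be separated by one conventional descriptor set.

```python
import numpy as np


def fourier_descriptors(contour, n_coeffs=8):
    """Magnitude-only Fourier descriptors of a closed contour (illustrative).

    contour: (N, 2) array of boundary points in traversal order.
    Returns 2 * n_coeffs values: the lowest positive- and negative-frequency
    magnitudes, scaled so that the largest equals 1.
    """
    z = contour[:, 0] + 1j * contour[:, 1]  # boundary as a complex signal
    mags = np.abs(np.fft.fft(z))            # discarding phase removes rotation and start point
    # Skipping index 0 (the centroid term) removes translation.
    low = np.concatenate([mags[1:n_coeffs + 1], mags[-n_coeffs:]])
    return low / low.max()                  # dividing by the largest magnitude removes scale


# Hypothetical asymmetric test shape (not from the paper), traced counterclockwise.
theta = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
r = 1.0 + 0.3 * np.cos(2 * theta) + 0.2 * np.sin(3 * theta) + 0.15 * np.cos(theta)
contour = np.column_stack([r * np.cos(theta), r * np.sin(theta)])

# Mirror image: flip x, then reverse the point order so the mirrored
# shape is again traced counterclockwise, as a contour tracer would do.
mirrored = (contour * np.array([-1.0, 1.0]))[::-1]

# Rotation by 40 degrees about the origin.
phi = np.deg2rad(40.0)
rotation = np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi), np.cos(phi)]])
rotated = contour @ rotation.T

fd = fourier_descriptors(contour)
fd_mirror = fourier_descriptors(mirrored)
fd_rot = fourier_descriptors(rotated)

print(np.allclose(fd, fd_rot))     # compares the original with its rotated copy
print(np.allclose(fd, fd_mirror))  # compares the original with its mirror image
```

Both comparisons hold because rotation and reflection change only the phase of the Fourier coefficients, which the descriptor discards. The abstract states that the three-set approach, with each set describing a different portion of the object, resolves this ambiguity; how the portions are chosen is not given there, so the sketch does not attempt it.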
