Abstract

This thesis addresses the problem of automatic recognition of any hand written symbol. The number of different styles of handwritten symbols demonstrates the difficulties that an automatic recognizer must cope with. The main technical approaches for solving the problems in recognizing patterns are statistical pattern recognition and structural pattern recognition. Statistical pattern recognition systems use pixel-features for recognition. Some of these features are moments, histograms, Fourier transforms, and the percentage of ink pixels in different zones. Although statistical pattern recognition techniques, including Artificial Neural Networks (ANNs), carry a high recognition rate, there are some disadvantages that are in these systems. Disadvantages include the requirements of a very large number of training data, and the inability to justify its answer. As opposed to statistical pattern recognition techniques, structural pattern recognition techniques extract commonly used descriptions of the patterns (structural-features). These features include loops, end points, and arcs. In this research, the structural pattern recognition approach was used for developing an expert system that extracts structural-features (descriptions) from the symbol at each stage of recognition. The developed system enabled us to automatically recognize handwritten symbols, assuming that the symbols are in their isolated forms. To obtain a representation of a symbol the system performs four basic steps. First, the system adjusts the symbol by rotating it around its central point until its principal axis aligns with the vertical axis or having a multiple of 20° to the vertical axis. Second, the system scales the symbol to a predefined size. The third step is to thin the symbol. A novel rule-based system for thinning is developed in this research. The resultant thinned image is composed of the central lines of the image. Finally, the last step involves extracting and describing the thinned symbol in terms of strokes. These strokes will be approximated by a set of line segments. The resulting representation of the symbol is compared with different stored models of the different symbols in the system knowledge base. For each symbol many models are stored. The system was tested with 5726 handwritten English characters. When the system learned an average of 97 models per symbol and used a low threshold, the recognition rate was 95% and the rejection rate was 16.1%. The recognition rate of our system can be increased by storing more models for each symbol or by increasing the rejection rate. The system is capable of learning new symbols by simply adding models for these symbols to the system knowledge base. (Abstract shortened by UMI.)

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call