An Approach of Hidden Markov Model for Offline Yoruba Handwritten Word Recognition

Jumoke F Ajao, Stephen O Olabiyisi, Elijah O Omidiora, Oladayo O Okediran

doi:10.9734/bpi/ctmcs/v2/1659e

Abstract

This paper presents a recognition system for handwritten Yorùbá words using a Hidden Markov Model (HMM). The work is divided into four stages: data acquisition, preprocessing, feature extraction and classification. Data were collected from adult indigenous writers, and the scanned images were preprocessed to enhance their quality through greyscale conversion, binarization, noise removal and normalization. A new set of features for handwritten Yorùbá words was then extracted from each normalized word image using the Discrete Cosine Transform (DCT): zigzag scanning was applied to the spatial-frequency representation of the word image to capture the character shape, the underdot and the diacritic sign. A ten-state left-to-right HMM was used to model the Yorùbá words. The initial probabilities of the HMM were randomly generated based on the model created for the Yorùbá alphabet. One HMM was constructed for each class of image feature. The Baum-Welch re-estimation algorithm was applied to train each class HMM on the DCT feature vectors of the handwritten word images. The Viterbi algorithm was used to classify each handwritten word, giving the state sequence that best describes the model. Our experiments reported a highest test accuracy of 92% and a recognition rate of 95.6%, indicating that the recognition system performs accurately.
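The DCT-and-zigzag feature extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the image size, the DCT normalization, or how many coefficients are kept, so the orthonormal DCT-II, the JPEG-style scan order, and the names `dct_2d`, `zigzag_indices`, `dct_zigzag_features` and `num_coeffs` are all assumptions made here for illustration, not the authors' implementation.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square matrix given as a list of lists."""
    n = len(block)

    def alpha(k):
        # Scaling factors that make the transform orthonormal (an assumption).
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # 1-D cosine basis: basis[k][i] = alpha(k) * cos(pi * (2i + 1) * k / (2n))
    basis = [[alpha(k) * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
              for i in range(n)] for k in range(n)]
    # Transform along the first axis, then the second: C = B * X * B^T
    tmp = [[sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
            for u in range(n)]


def zigzag_indices(n):
    """(row, col) pairs of an n x n matrix in JPEG-style zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def dct_zigzag_features(image, num_coeffs):
    """Keep the first num_coeffs DCT coefficients in zigzag order.

    Low-frequency coefficients come first, so truncating the scan keeps
    the coarse shape information of the (hypothetical) word image.
    """
    coeffs = dct_2d(image)
    return [coeffs[r][c] for r, c in zigzag_indices(len(image))[:num_coeffs]]
```

As a usage example, `dct_zigzag_features(image, 10)` on a normalized square word image would yield a 10-element feature vector. In a pipeline like the one the abstract outlines, vectors of this kind would be the observations used to train one HMM per class.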

Full Text