Abstract

Most traditional postcode recognition systems implicitly assume that the 10 numerals (0–9) are evenly distributed. This assumption is unrealistic, because the distribution of 0–9 in the postcodes of a country or a city is generally imbalanced: some numerals appear in far more postcodes than others. In this paper, we study cost-sensitive neural network classifiers to address the class imbalance problem in postcode recognition. Four methods are considered for training the neural networks: cost-sampling, cost-convergence, rate-adapting and threshold-moving. Cost-sampling adjusts the distribution of the training data so that the costs of the classes are conveyed explicitly by how often their instances appear. Cost-convergence and rate-adapting are applied in the training phase by modifying the training algorithm of the neural network. Threshold-moving raises the probability estimates of expensive classes so that samples with higher costs are less likely to be misclassified. Experiments are conducted on 10,702 postcode images using five cost matrices based on the distribution of numerals in postcodes. The results suggest that cost-sensitive learning is indeed effective for class-imbalanced postcode analysis and recognition. They also show that cost-sampling with a proper cost matrix outperforms the other methods in this application.
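To make two of the four methods concrete, the following is a minimal sketch of threshold-moving and cost-sampling. It does not reproduce the paper's implementation: the abstract does not give the exact update rules, so the scaling rule (each class probability multiplied by that class's total misclassification cost, then renormalised) and the resampling rule (class frequencies made proportional to a per-class cost) are assumptions. The function names, the cost-matrix layout and the toy values are hypothetical.

```python
import random
from collections import Counter


def threshold_moving(probs, cost):
    """Rescale class probabilities by misclassification cost, then pick the argmax.

    probs[i]   : the network's estimated probability for class i
    cost[i][j] : assumed cost of misclassifying a class-i sample as class j
                 (diagonal entries are expected to be 0)
    """
    # Assumed rule: weight each class by the total cost of getting it wrong.
    scaled = [p * sum(row) for p, row in zip(probs, cost)]
    total = sum(scaled)
    scaled = [s / total for s in scaled]  # renormalise so the values sum to 1
    best = max(range(len(scaled)), key=scaled.__getitem__)
    return best, scaled


def cost_sampling(samples, labels, class_cost, size, seed=0):
    """Resample with replacement so class frequencies track class costs.

    class_cost[y] : assumed per-class cost; a class with a higher cost is
                    drawn more often, regardless of its original frequency.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    # Dividing by the class count removes the original imbalance, so the
    # expected share of class y in the new set is proportional to class_cost[y].
    weights = [class_cost[y] / counts[y] for y in labels]
    idx = rng.choices(range(len(labels)), weights=weights, k=size)
    return [samples[i] for i in idx], [labels[i] for i in idx]


# Toy two-class example: class 1 is five times as costly to misclassify.
toy_cost = [[0, 1],
            [5, 0]]
predicted, adjusted = threshold_moving([0.6, 0.4], toy_cost)

xs, ys = cost_sampling(["a", "b", "c", "d"], [0, 0, 0, 1],
                       class_cost={0: 1, 1: 5}, size=50)
```

In the toy example the unadjusted output favours class 0, but after scaling the decision moves to the more expensive class 1. Threshold-moving leaves the trained network untouched and changes only the decision step, whereas cost-sampling changes the data the network is trained on.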
