Abstract

This paper proposes a new method for extracting multilingual text from images by discriminating characters from non-characters using Gaussian mixture modeling of neighboring characters. The image is first binarized, and a morphological closing operation is applied to the binary image so that each character can be treated as a single connected component. The neighborhood of each connected component is then computed from the Voronoi partition of the image, and each component is labeled as character or non-character according to its neighbors. We applied the proposed method to Chinese and English text extraction, and the experimental results confirm its effectiveness.
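
The neighborhood step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the image is already binarized and closed, and is given as a list of rows of 0/1 values. The function names `label_components` and `voronoi_neighbors` are hypothetical. The Voronoi partition is approximated by growing all components simultaneously under a city-block metric. The Gaussian mixture classification step is omitted.

```python
from collections import deque

# Offsets of the four edge-adjacent pixels (4-connectivity is an assumption).
STEPS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def label_components(img):
    """Label 4-connected foreground regions.

    Returns (labels, count), where labels[y][x] is 0 for background
    and 1..count for the component that owns the pixel.
    """
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:  # flood-fill one component
                    cy, cx = queue.popleft()
                    for dy, dx in STEPS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def voronoi_neighbors(labels, count):
    """Approximate the Voronoi partition and return region adjacency.

    Every component grows outward at the same rate (multi-source BFS),
    so each background pixel is claimed by a nearest component.
    Two components are neighbors if their grown regions share an edge.
    """
    h, w = len(labels), len(labels[0])
    owner = [row[:] for row in labels]
    queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
    while queue:
        cy, cx = queue.popleft()
        for dy, dx in STEPS:
            ny, nx = cy + dy, cx + dx
            if 0 <= ny < h and 0 <= nx < w and owner[ny][nx] == 0:
                owner[ny][nx] = owner[cy][cx]
                queue.append((ny, nx))
    neighbors = {k: set() for k in range(1, count + 1)}
    for y in range(h):
        for x in range(w):
            # Checking the pixel below and to the right covers each edge once.
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < h and nx < w and owner[y][x] != owner[ny][nx]:
                    a, b = owner[y][x], owner[ny][nx]
                    neighbors[a].add(b)
                    neighbors[b].add(a)
    return neighbors
```

For three vertical strokes in a row, the middle stroke is a neighbor of both outer strokes, while the outer strokes are not neighbors of each other. In a fuller pipeline, features of each component's neighbors would then feed the character versus non-character decision.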
