Implementation of Words and Characters Segmentation of Gujarati Script Using MATLAB

B Shah Gargi, S Sajja Priti

doi:10.1007/978-3-030-38040-3_29

Abstract

Over the past decades, much research has been carried out on identifying characters from images in the Gujarati language. However, few solutions are available for segmenting words and characters in Gujarati from a text file. In this paper, we propose a novel algorithm that takes as input a text file containing Gujarati script and segments it into words and characters. The proposed algorithm has been implemented in MATLAB and validated with 10 numbers written in words, each of varying length. Screenshots attached to this paper show the segmentation of words and characters. This system is a subsystem of a much larger system in which Gujarati text is to be converted into its equivalent speech; without proper segmentation of words and characters, that task cannot be achieved. Existing solutions and tools are first analysed, and a summary is presented in the literature survey. The proposed research work is designed based on the limitations found through the survey. An experiment is also carried out and discussed in detail, along with the results achieved. Finally, the conclusion and possible future enhancements are presented.
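
To make the two segmentation stages concrete, the following is a minimal sketch in Python rather than the MATLAB used in the paper. It is not the authors' algorithm: the function names `segment_words` and `segment_characters` are hypothetical, and it assumes that words are separated by whitespace and that a "character" is a base letter together with any dependent vowel signs or other combining marks that follow it. Conjuncts joined by the virama are not merged in this sketch.

```python
import unicodedata


def segment_words(text):
    """Split text into words on runs of whitespace (assumed word boundary)."""
    return text.split()


def segment_characters(word):
    """Group each base code point with the combining marks that follow it.

    Assumption: Unicode categories Mn (nonspacing mark) and Mc (spacing
    combining mark) identify dependent signs such as Gujarati vowel signs
    and the virama. This is an illustrative rule, not the paper's method.
    """
    clusters = []
    for ch in word:
        if clusters and unicodedata.category(ch) in ("Mn", "Mc"):
            # Dependent sign: attach it to the preceding base letter.
            clusters[-1] += ch
        else:
            # Base letter (or any other code point): start a new cluster.
            clusters.append(ch)
    return clusters


def segment_text(text):
    """Return a list of (word, characters) pairs for the whole input."""
    return [(word, segment_characters(word)) for word in segment_words(text)]
```

For an illustrative input of two number words such as "એક બે" (one, two), `segment_text` would yield one entry per word, with the vowel sign in "બે" kept attached to its consonant instead of being split off as a separate character.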

Full Text