Abstract

In automatic text categorization, quantifiable feature information is extracted from a text, and the text is assigned to a category on the basis of that information. This information consists of the values of a set of one or more measurements, which can be taken as frequencies, or functions of frequencies, of linguistic elements. For various languages, researchers have established the role that the systematic study of word length and the analysis of word-length statistics of different texts play in text classification and genre discrimination. The present paper tests the contribution of quantitative word-length features to the classification of written Hindi texts, extracting quantitative measures from word-length profiles and frequencies.
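
As a rough illustration of what a word-length profile can look like in practice, the sketch below computes the relative frequency of each word length and the mean word length for a text. It is not the paper's method: the function name `word_length_profile`, whitespace tokenization, and measuring length in Unicode code points are all assumptions made for this example. For Devanagari text in particular, vowel signs and other combining marks count as separate code points, so a syllable-based or grapheme-based length could be a better fit.

```python
from collections import Counter


def word_length_profile(text):
    """Return (relative frequency per word length, mean word length).

    Assumptions (not taken from the paper): words are split on
    whitespace, punctuation is not stripped, and length is counted
    in Unicode code points.
    """
    words = text.split()
    if not words:
        return {}, 0.0
    lengths = [len(word) for word in words]
    counts = Counter(lengths)
    total = len(lengths)
    # Relative frequency of each observed word length, in ascending order.
    profile = {n: counts[n] / total for n in sorted(counts)}
    mean_length = sum(lengths) / total
    return profile, mean_length
```

The resulting profile values and the mean could then serve as the quantitative features passed to a classifier; which features and which classifier to use is left open here.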
