Abstract

In this paper, we propose several improved versions of the rapid automatic keyword extraction (RAKE) algorithm for extracting keywords from Hindi documents. RAKE requires a stopword list to generate its set of candidate keywords, and since no such list was available for Hindi, we constructed one for this purpose. We identify weaknesses in RAKE's keyword scoring measures and propose several models, namely N-RAKE, SD-RAKE, NSD-RAKE, and WOS-RAKE, to improve its effectiveness. We find that our modifications generally yield better results than the original RAKE.
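
For orientation, the baseline that these variants modify can be sketched as follows. This is a minimal illustration of the original RAKE procedure only: it splits text into candidate keywords at stopwords and punctuation, scores each word as degree divided by frequency, and scores each candidate as the sum of its word scores. It does not implement N-RAKE, SD-RAKE, NSD-RAKE, or WOS-RAKE, whose definitions are not given in this abstract. The function names and the two-word stopword set are hypothetical and stand in for the Hindi stopword list constructed in the paper.

```python
import re
from collections import defaultdict


def candidate_phrases(text, stopwords):
    """Split text into candidate keywords at stopwords and punctuation."""
    phrases = []
    # Break on common punctuation, including the Devanagari danda.
    for fragment in re.split(r"[.,;:!?()।\n]+", text):
        current = []
        for word in fragment.split():
            word = word.lower()  # no effect on Devanagari, kept for mixed text
            if word in stopwords:
                # A stopword ends the current candidate keyword.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(phrases):
    """Score candidates with the original RAKE measure deg(w) / freq(w)."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurring words in the phrase, including the word itself.
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    # A candidate's score is the sum of its member word scores.
    return {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}


# Hypothetical toy stopword set, not the list built in the paper.
toy_stopwords = {"की", "है"}
phrases = candidate_phrases("भारत की राजधानी नई दिल्ली है", toy_stopwords)
scores = rake_scores(phrases)
```

On this toy sentence the stopwords split the text into two candidates, and the three-word candidate receives the higher score because each of its words has degree 3. This tendency of the degree-to-frequency measure to favor longer candidates is one property of the baseline scoring; whether it is among the weaknesses the paper addresses is not stated in the abstract.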
