Abstract

Keyword Extraction is an important task in several text analysis endeavours. In this paper, we present a critical discussion of the issues and challenges in graph-based keyword extraction methods, along with comprehensive empirical analysis. We propose a parameterless method for constructing graph of text that captures the contextual relation between words. A novel word scoring method is also proposed based on the connection between concepts. We demonstrate that both proposals are individually superior to those followed by the sate-of-the-art graph-based keyword extraction algorithms. Combination of the proposed graph construction and scoring methods leads to a novel, parameterless keyword extraction method (sCAKE) based on semantic connectivity of words in the document.Motivated by limited availability of NLP tools for several languages, we also design and present a language-agnostic keyword extraction (LAKE) method. We eliminate the need of NLP tools by using a statistical filter to identify candidate keywords before constructing the graph. We show that the resulting method is a competent solution for extracting keywords from documents of languages lacking sophisticated NLP support.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call