Graph Based Document Model and its Application in Keyphrase Extraction

Sincy V Thambi, P C Reghuraj

doi:10.1109/spices52834.2022.9774141

Abstract

Graph-based document models offer an alternative to the traditional Bag of Words (BOW) model: they represent documents efficiently by capturing syntactic and semantic characteristics of the text. Because a graph is a mathematical model, it can capture these properties effectively, support a variety of Natural Language Processing (NLP) applications, and simplify related computations. In a graph model, document terms are represented as nodes, and relationships between those terms are represented as edges, each weighted according to the strength of the relationship. This work presents a systematic survey of existing graph models, focusing on the node selection, edge selection, and edge weighting factors appropriate for diverse NLP applications. Keyphrase extraction based on graph models, along with various ranking algorithms, is also examined. The study shows that graph-based document models play an important role in representing textual documents and in supporting complex NLP tasks.
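As a rough illustration of the node, edge, and ranking ideas summarized in the abstract, the following minimal sketch builds a term co-occurrence graph and ranks its nodes with a weighted PageRank-style iteration. It is not the authors' method: the co-occurrence edge definition, the window size, the damping factor, the tiny stopword list, and the function names `build_cooccurrence_graph` and `rank_terms` are all assumptions chosen for illustration, and co-occurrence is only one of many possible edge definitions.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"a", "an", "the", "of", "and", "in", "to", "is", "are",
             "as", "for", "with", "by", "on"}


def build_cooccurrence_graph(text, window=3):
    """Build an undirected, weighted term graph.

    Nodes are non-stopword terms. Two distinct terms are linked when they
    fall inside the same span of `window` consecutive tokens; the edge
    weight counts how many times that happens.
    """
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS]
    graph = defaultdict(lambda: defaultdict(float))
    for i, term in enumerate(tokens):
        # Look ahead at the next (window - 1) tokens.
        for other in tokens[i + 1:i + window]:
            if other != term:
                graph[term][other] += 1.0
                graph[other][term] += 1.0
    return graph


def rank_terms(graph, damping=0.85, iterations=50):
    """Score nodes with a weighted PageRank-style update (assumed settings)."""
    scores = {node: 1.0 for node in graph}
    # Total edge weight attached to each node, used to normalize its votes.
    strength = {node: sum(nbrs.values()) for node, nbrs in graph.items()}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            incoming = sum(
                graph[nbr][node] / strength[nbr] * scores[nbr]
                for nbr in graph[node]
            )
            new_scores[node] = (1 - damping) + damping * incoming
        scores = new_scores
    # Highest-scoring terms first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_terms(build_cooccurrence_graph(document_text))` returns `(term, score)` pairs in descending order of score; the top-ranked terms would serve as keyword candidates. Other node choices (for example phrases or sentences) and other edge weightings would slot into the same two-step structure.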

Full Text