Abstract

Background: Automatic keyphrase extraction (AKE) is essential to many NLP and information retrieval (IR) tasks. Extracting high-quality keyphrases is difficult because of technological advancements and the exponential growth of textual data and digital sources. Unsupervised keyphrase extraction has a low computing cost and relies on heuristic notions of phrase importance, such as embedding similarities, but developing these heuristics requires in-depth subject expertise. Materials and Methods: This paper presents a method that obtains a semantic understanding of the query and the indexed documents by using an embedding technique, the Universal Sentence Encoder (USE), while keeping the most informative keyphrases using Maximal Marginal Relevance (MMR). The documents most relevant to the query vector are then scored through an inverted index to improve the performance of IR systems. Results: The proposed retrieval model was implemented on the FIRE 2011 dataset. In the final stage, the baseline results and the results of the proposed indexing and ranking were evaluated using mean average precision (MAP). The baseline achieved a MAP of 0.61, while the inverted index achieved 0.6277519. Conclusions: In this paper, we discussed document retrieval that keeps keyphrases with informativeness properties by using maximal marginal relevance, since extracting a fixed number of top keyphrases leads to redundancy that hinders the diversification of the extracted keyphrases.
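
The MMR selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `mmr_select`, the trade-off parameter `lam`, and the toy 2-D vectors (standing in for USE embeddings) are all assumptions made for illustration. At each step the candidate with the best balance between similarity to the document and dissimilarity to the already selected keyphrases is chosen.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(doc_vec, cand_vecs, k, lam=0.5):
    """Return the indices of up to k candidates chosen by MMR.

    score(i) = lam * sim(candidate_i, doc)
               - (1 - lam) * max sim(candidate_i, already selected)
    `lam` is a hypothetical trade-off parameter: 1.0 ranks by relevance
    only, lower values penalise redundancy more strongly.
    """
    relevance = [cosine(doc_vec, c) for c in cand_vecs]
    selected = []
    remaining = list(range(len(cand_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            # Redundancy is 0.0 while nothing has been selected yet.
            redundancy = max(
                (cosine(cand_vecs[i], cand_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy vectors (assumed): candidates 0 and 1 are near-duplicates,
# candidate 2 is less relevant to the document but different from both.
doc = [1.0, 0.0]
candidates = [[1.0, 0.10], [1.0, 0.12], [0.6, 0.8]]

diverse = mmr_select(doc, candidates, k=2, lam=0.3)
relevance_only = mmr_select(doc, candidates, k=2, lam=1.0)
```

With `lam=1.0` the two near-duplicate candidates are both kept, whereas a lower `lam` replaces the second one with the more distinct candidate, which is the redundancy effect the conclusions describe.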
