Application of the TF-IDF and Cosine Similarity Algorithms to Search Queries on a Tourist Destination Dataset

Rio Al Rasyid, Dewi Handayani Untari Ningsih

doi:10.35870/jtik.v8i1.1416

Abstract

This research aims to improve the search for tourist destinations in a dataset of 50 documents by using search queries to retrieve relevant documents. The goal is to produce an accurate, ranked list of tourist destinations for a given query. To achieve this, the researchers used the TF-IDF and Cosine Similarity algorithms to retrieve and compare information, measuring the similarity score between each search query and each tourist destination in the dataset. The fifty text documents were first normalized through the pre-processing stages of Case Folding, Stopword Removal, and Tokenization. The normalized documents were then weighted using TF-IDF, and the same TF-IDF weighting was applied to the search queries. Cosine Similarity between each document's TF-IDF vector and the query's TF-IDF vector yields a similarity score for each document, and the destinations are ranked by that score. Testing was carried out on 5 different queries and produced an average precision of 83%.
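The pipeline summarized in the abstract (pre-processing, TF-IDF weighting, cosine similarity, ranking, and a precision check) can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the stopword list, the three toy documents, the smoothed IDF formula, and the length-normalized term frequency are all assumptions, since the abstract does not specify them. For convenience the sketch tokenizes before removing stopwords.

```python
import math
import re
from collections import Counter

# Hypothetical, abbreviated stopword list (the paper's list is not given).
STOPWORDS = {"the", "a", "an", "is", "in", "of", "and", "for", "to", "on"}

def preprocess(text):
    """Case folding, tokenization, then stopword removal."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def build_idf(tokenized_docs):
    """Smoothed IDF per term (assumed variant: ln((1 + N) / (1 + df)) + 1)."""
    n = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the corpus are dropped."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def search(query, docs):
    """Return (document index, score) pairs, highest score first."""
    tokenized = [preprocess(d) for d in docs]
    idf = build_idf(tokenized)
    doc_vectors = [tfidf_vector(tokens, idf) for tokens in tokenized]
    query_vector = tfidf_vector(preprocess(query), idf)
    scores = [(i, cosine_similarity(query_vector, dv))
              for i, dv in enumerate(doc_vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

def precision(retrieved, relevant):
    """Fraction of retrieved document indices that are relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)

# Toy documents invented for illustration; they are not from the paper's dataset.
DOCS = [
    "Borobudur Temple is a Buddhist temple in Central Java",
    "Kuta Beach in Bali is popular for surfing and sunsets",
    "Prambanan is a Hindu temple complex near Yogyakarta",
]

ranking = search("temple in Java", DOCS)
for index, score in ranking:
    print(f"{score:.3f}  {DOCS[index]}")
```

Storing vectors as dictionaries keeps the sketch dependency-free. For a real collection, a library vectorizer with sparse matrices would likely be the more practical choice.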
