DETECTING ARTICLE SIMILARITY THROUGH KEYWORDS USING THE FUZZY STRING MATCHING METHOD IN NATURAL LANGUAGE PROCESSING

Humuntal Rumapea

doi:10.46880/jmika.vol5no1.pp60-66

Abstract

Natural Language Processing (NLP) is the branch of artificial intelligence concerned with processing natural language, that is, the language humans commonly use to communicate with one another. It can also be applied to written articles. Knowing how similar the content of one scientific article is to another makes it easier for readers to select articles and to collect related documents. It also helps reveal differences in content between articles that use the same keywords. Similarity is measured using the keywords listed in the abstract, each of which may consist of several words, together with the number of keywords. Fuzzy string matching is used to determine how many of the keywords are used in an article. A count is then computed for each keyword, and the keywords are sorted by priority according to the position in which they appear. The search proceeds through the title, abstract, keywords, content, and references, in that order. The number of keywords found in each article is divided by the total number of keywords to produce a similarity percentage. If two articles yield the same number of matches, they can be categorized as having similar content. Testing is carried out by counting the keywords in the source article and then comparing every source keyword against the destination articles in the database, searching and comparing word by word.
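
The procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the matching algorithm, the match threshold, or any data structures, so everything here is an assumption made for illustration: the standard-library `difflib.SequenceMatcher` stands in for the fuzzy matcher, the threshold of 0.85 is arbitrary, and the names `tokenize`, `fuzzy_count`, `keyword_similarity`, and the sample article are hypothetical.

```python
from difflib import SequenceMatcher
import re

# Sections are searched in the order listed in the abstract.
SECTION_ORDER = ["title", "abstract", "keywords", "content", "references"]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fuzzy_count(keyword, text, threshold=0.85):
    """Count word windows in `text` that approximately match `keyword`.

    The 0.85 threshold is an assumed value, not one taken from the paper.
    """
    key_tokens = tokenize(keyword)
    words = tokenize(text)
    n = len(key_tokens)
    if n == 0 or len(words) < n:
        return 0
    target = " ".join(key_tokens)
    count = 0
    # Slide a window with as many words as the keyword over the text.
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        if SequenceMatcher(None, target, window).ratio() >= threshold:
            count += 1
    return count


def keyword_similarity(source_keywords, article, threshold=0.85):
    """Return (percentage, hits) for one destination article.

    `article` maps section names to text. For each keyword that is found,
    `hits` records the first section (by priority) and the total count.
    """
    hits = {}
    for keyword in source_keywords:
        counts = {s: fuzzy_count(keyword, article.get(s, ""), threshold)
                  for s in SECTION_ORDER}
        total = sum(counts.values())
        if total > 0:
            first = next(s for s in SECTION_ORDER if counts[s] > 0)
            hits[keyword] = {"first_section": first, "count": total}
    if not source_keywords:
        return 0.0, hits
    # Keywords found divided by all source keywords, as a percentage.
    return 100.0 * len(hits) / len(source_keywords), hits


# Hypothetical data: the destination article contains deliberate misspellings.
source_keywords = ["fuzzy string matching", "natural language processing",
                   "keywords", "plagiarism"]
article = {
    "title": "Fuzy string matching for article similarity",
    "abstract": "We apply natural language procesing to compare keyword lists.",
    "keywords": "fuzzy string matching, NLP",
    "content": "",
    "references": "",
}
percentage, hits = keyword_similarity(source_keywords, article)
print(percentage, hits)
```

Comparing windows of whole words, rather than raw character offsets, lets multi-word keywords tolerate small spelling differences while keeping the count tied to word positions. Recording the first section in which a keyword appears is one possible reading of "sorted by priority according to the position of the keyword"; the paper may rank keywords differently.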
