Abstract

Stemming is a pre-processing step that identifies the base form (stem) of a word by conflating or resolving its variants. Every language is unique, and the most popular stemming algorithm for Indonesian text is the Nazief-Adriani algorithm. This study compares the Nazief-Adriani algorithm with another stemming algorithm applied to Indonesian text, the Paice-Husk algorithm, which is commonly used for English. In addition to applying both algorithms to the stemming process, this study uses McCabe's cyclomatic complexity metric to evaluate the complexity of each algorithm. In an experiment on 20 sentences containing a thousand words, the Nazief-Adriani algorithm was more accurate than the Paice-Husk algorithm, at 91.87% compared to 64.43%. In terms of complexity, the Paice-Husk algorithm is also more complex than the Nazief-Adriani algorithm. In terms of processing time, however, the Paice-Husk algorithm is slightly faster than the Nazief-Adriani algorithm. These results indicate that the Paice-Husk algorithm requires a more complete implementation of Indonesian morphological and grammatical rules to produce better Indonesian stems.
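The two evaluation measures named above can be illustrated with a minimal Python sketch. It rests on assumptions: the abstract does not state how accuracy was computed, so the sketch takes it to be the percentage of words whose produced stem matches a reference stem, and it uses the standard McCabe formula M = E - N + 2P on a control-flow graph. The function names, the four sample words, and the incorrect stem "pertemu" are hypothetical and are not taken from the study's data.

```python
def stemming_accuracy(predicted, reference):
    """Percentage of words whose predicted stem equals the reference stem.

    Assumption: accuracy is measured per word against a reference list.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("reference must not be empty")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's cyclomatic complexity M = E - N + 2P of a control-flow graph."""
    return edges - nodes + 2 * components


# Hypothetical sample: reference stems for membaca, makanan, berlari, pertemuan.
reference = ["baca", "makan", "lari", "temu"]
# Hypothetical stemmer output with one wrong stem in the last position.
predicted = ["baca", "makan", "lari", "pertemu"]

accuracy = stemming_accuracy(predicted, reference)

# Control-flow graph of a single if/else: 4 nodes (condition, then-branch,
# else-branch, join) and 4 edges, in one connected component.
complexity = cyclomatic_complexity(edges=4, nodes=4)

print(f"accuracy = {accuracy:.2f}%, cyclomatic complexity = {complexity}")
```

Under these assumptions, a higher accuracy means more words reduced to the correct stem, and a higher cyclomatic complexity means more independent paths through the stemmer's code.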
