Extraction of Malay Root Word that Starts with Letter P in Malay e-Khutbah using Rule Based

Nurhilyana Anuar, Normaly Kamal Ismail, Zamri Abu Bakar

doi:10.15282/ijsecs.9.1.2023.4.0108

Open Access

Abstract

Stemming is an important step in text processing, especially in Natural Language Processing (NLP). It extracts the root word from affixed words in a text, and it helps extract useful information for many research areas, such as Information Retrieval. Several stemming algorithms have been discussed in previous studies. However, studies on Malay stemming are limited, both in number and in the amount of experimental data used. In this study, we focus on a rule-based stemming algorithm for Malay, applied to a larger dataset of Malay-language text. The stemming process uses a syntactic, linguistic rule-based method that removes prefixes, suffixes, and prefix-suffix combinations. The training dataset consisted of 3233 sentences from e-khutbah text. The experimental evaluation measured precision, recall and F-measure. The algorithm showed promising results relative to the total amount of data used in each test: precision, recall and F-measure increased to 95%, 97% and 97% respectively. The enhanced stemming process has a significant impact on Malay text processing and, in general, improves the performance of NLP applications.
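The abstract does not list the rules themselves, so the following is only a minimal Python sketch of the kind of rule-based stemming it describes: removing a prefix, a suffix, or both, and accepting a candidate only if it appears in a root-word dictionary. The affix lists, the tiny dictionary and the names `PREFIX_RULES`, `SUFFIXES`, `ROOTS`, `stem` and `precision_recall_f` are all hypothetical and are not taken from the paper. The one linguistic rule it assumes is the standard Malay pattern in which a root-initial "p" is dropped after the prefixes "mem-" and "pem-" (for example, "pukul" becomes "memukul"), so the sketch tries restoring that "p". The metric function uses the standard definitions; how the paper counts correct and incorrect stems is not stated here.

```python
# Hypothetical rule tables for illustration only; not the paper's rules.
# Each prefix rule is (prefix, letter to restore at the start of the root).
PREFIX_RULES = [
    ("mem", "p"),  # memukul -> pukul (root-initial "p" was dropped)
    ("pem", "p"),  # pemakaian -> pakai
    ("mem", ""),
    ("pem", ""),
    ("di", ""),
]
SUFFIXES = ["kan", "an", "i"]

# Tiny stand-in for a real root-word dictionary.
ROOTS = {"pukul", "pakai", "baca", "ajar"}


def candidates(word):
    """Yield candidate roots: prefix removed, suffix removed, then both."""
    for prefix, restored in PREFIX_RULES:
        if word.startswith(prefix):
            yield restored + word[len(prefix):]
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            yield word[:-len(suffix)]
    for prefix, restored in PREFIX_RULES:
        for suffix in SUFFIXES:
            long_enough = len(word) > len(prefix) + len(suffix)
            if long_enough and word.startswith(prefix) and word.endswith(suffix):
                yield restored + word[len(prefix):-len(suffix)]


def stem(word, roots=ROOTS):
    """Return the first candidate found in the dictionary, else the word itself."""
    word = word.lower()
    if word in roots:
        return word
    for candidate in candidates(word):
        if candidate in roots:
            return candidate
    return word


def precision_recall_f(tp, fp, fn):
    """Standard precision, recall and F-measure from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

With this toy dictionary, `stem("memukul")` returns `"pukul"`, `stem("pemakaian")` returns `"pakai"`, and `stem("bacaan")` returns `"baca"`; a word with no matching rule is returned unchanged. The dictionary check is what keeps the stemmer from over-stripping, which is why a real system would need a much larger root list than the four entries assumed here.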

Full Text