Abstract

Rule-based Named Entity Recognition (NER) has been widely investigated and applied to various languages, mainly English. However, rules developed for English do not transfer directly to Malay because the two languages differ in morphology. Challenging issues in Malay NER include cross-references between named entities and entity repetition. This paper proposes a rule-based approach to address these issues. The study comprises four stages: building a corpus of Malay online news, developing a gazetteer, developing rules, and evaluation. It focuses on nine named-entity types, including person, organization, position, date, time, currency, measurement and percentage. Overall, the experimental results show 90.23% precision, 92.13% recall and 91.05% F-measure. The outcome of this research is expected to help other researchers implement rule-based Malay NER, with new rules added to achieve higher accuracy.
