Abstract

Semantic ontologies are widely used in natural language processing, underpinning applications such as knowledge extraction, question answering, machine translation, text comprehension, information retrieval, and text summarization. Kurdish is a low-resource language, and although some ontological research exists for other Kurdish dialects, no semantic web ontology is available for the Badini dialect. This paper addresses this gap by presenting a methodology for constructing and using a semantic web ontology for the Badini dialect of Kurdish. A Badini corpus (UOZBDN) was created and manually annotated with part-of-speech (POS) tags. An HMM-based POS tagger was then trained on the UOZBDN corpus and applied to annotate additional text for ontology extraction. Ontology extraction applied predefined rules to identify nouns and verbs in the model-annotated corpus and to combine them into semantic predicates. The approach produced accurate results: the POS tagging model attained an accuracy of 95.04% on the UOZBDN corpus, and a manual evaluation by Badini Kurdish language experts yielded an accuracy of 97.42% for the extracted ontology.
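
The rule-based extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual rules, so the rule used here (pair the nearest noun before a verb with the nearest noun after it) is an assumption, as are the function name `extract_predicates`, the tag names `NOUN` and `VERB`, and the English placeholder tokens, which stand in for tagger output on Badini text.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]          # (token, POS tag) as a tagger might emit
Triple = Tuple[str, str, str]     # (noun, verb, noun) candidate predicate


def extract_predicates(sentence: List[Tagged]) -> List[Triple]:
    """Hypothetical rule: for every verb, link the nearest noun to its
    left with the nearest noun to its right. This is NOT the paper's
    rule set, only an illustration of rule-based predicate forming."""
    triples: List[Triple] = []
    for i, (token, tag) in enumerate(sentence):
        if tag != "VERB":
            continue
        # Nearest noun before the verb (scan leftwards).
        left = next((w for w, t in reversed(sentence[:i]) if t == "NOUN"), None)
        # Nearest noun after the verb (scan rightwards).
        right = next((w for w, t in sentence[i + 1:] if t == "NOUN"), None)
        if left is not None and right is not None:
            triples.append((left, token, right))
    return triples


# Placeholder tokens; real input would be model-annotated Badini text.
sample = [("teacher", "NOUN"), ("reads", "VERB"), ("the", "DET"), ("book", "NOUN")]
print(extract_predicates(sample))
```

A real rule set would need to reflect Badini word order and morphology, and the resulting triples would still have to be mapped to ontology classes and properties. The sketch only shows how tagged tokens can be turned into candidate predicates.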
