Abstract
Objectives: To identify ontology concepts from text documents for the construction of a work process ontology.

Methods: This study proposes a methodology for identifying terms that represent work process ontology concepts in a document. The methodology comprises five steps: document collection, text pre-processing, term weighting and analysis, term mapping, and domain expert relevance judgment. The results of three term weighting schemes, namely Term Frequency (TF), Term Frequency-Inverse Document Frequency (TFIDF), and Mutual Information (MI), are compared against the ontology concepts judged relevant by the domain expert.

Findings: The approach adopted in this study extracted ontological concepts from the targeted domain knowledge source. The comparison suggests that the TFIDF term weighting scheme performs better than the TF and MI schemes.

Novelty: A work process ontology is a structured body of knowledge describing daily operations in the government sector. However, there has been little to no effort to build such an ontology. This study presents an integrated approach for identifying ontology concepts from documents within the work process domain. To the best of our knowledge, this research is the first attempt to introduce a structured methodology for the semi-automatic extraction and evaluation of concepts and relationships within this domain. The findings can serve as a foundation for developing an ontology in this field.

Keywords: Ontology, Work process, Text extraction, Natural language processing, Term weighting
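The three term weighting schemes compared in the Methods can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the whitespace tokenizer, and the sample documents are hypothetical, and the formulas are common textbook variants chosen as assumptions (raw corpus counts for TF, `tf * log(N / df)` for TFIDF, and pointwise mutual information between a term and a document for MI).

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical pre-processing: lowercase, split on whitespace,
    # keep purely alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def term_weights(docs):
    """Score every term under three schemes (assumed textbook variants)."""
    tokenized = [tokenize(d) for d in docs]
    counts = [Counter(toks) for toks in tokenized]
    n_docs = len(docs)
    total_tokens = sum(len(toks) for toks in tokenized)

    # TF: raw count of the term over the whole corpus.
    tf = Counter()
    for c in counts:
        tf.update(c)

    # Document frequency: number of documents containing the term.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    # TFIDF: corpus term frequency scaled by inverse document frequency.
    tfidf = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}

    # MI (assumption): highest pointwise mutual information between the
    # term and any document that contains it,
    # log( P(t, d) / (P(t) * P(d)) ).
    mi = {}
    for t in tf:
        scores = [
            math.log((c[t] * total_tokens) / (tf[t] * len(toks)))
            for c, toks in zip(counts, tokenized)
            if c[t] > 0
        ]
        mi[t] = max(scores)

    return {"tf": dict(tf), "tfidf": tfidf, "mi": mi}


if __name__ == "__main__":
    # Hypothetical work process snippets, for illustration only.
    sample = ["apply leave form", "approve leave form", "audit report"]
    weights = term_weights(sample)
    for scheme, scores in weights.items():
        ranked = sorted(scores, key=scores.get, reverse=True)
        print(scheme, ranked[:3])
```

Ranking the terms under each scheme and comparing the top candidates with the expert-judged concepts mirrors the comparison described above; the scoring of that agreement is left out because the abstract does not specify it.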