Abstract

Extracting domain knowledge and taking full advantage of it is an important way to reduce costs and accelerate processes in domain-related applications. A domain ontology provides a common, unambiguous understanding of a domain through a set of representational primitives, allowing users and the system to communicate with each other, and has been proposed as an important and natural way to represent domain knowledge. Most domain knowledge about domain entities, their properties, and their relationships is embodied in document collections, so extracting ontologies from these documents is an important means of ontology construction. This paper proposes GRAONTO, a graph-based approach for automatically constructing a domain ontology from a domain corpus. First, each document in the collection is represented as a graph. Once the document graphs are generated, random walk term weighting is used to estimate how relevant the information carried by a term is to the corpus, from both local and global perspectives. Next, the MCL (Markov Clustering) algorithm is used to disambiguate terms with different meanings and to group similar terms into concepts. Then, an improved gSpan algorithm constrained by both vertices and informativeness is used to find arbitrary latent relations among these concepts. Finally, the domain ontology is output in the OWL format. For ontology evaluation, a method is devised for adaptively adjusting concepts and relations with respect to their practical effectiveness. Evaluation experiments show that GRAONTO is a promising approach for domain ontology construction.
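
To make the first two steps more concrete, the following is a minimal sketch of a document graph and a random walk term weighting over it. It is not GRAONTO's actual formulation, which the abstract does not spell out: the co-occurrence window, the damping factor, the iteration count, and the function names `build_document_graph` and `random_walk_weights` are all assumptions chosen for illustration, and the walk shown is a generic TextRank-style walk over a single document (the local perspective only).

```python
from collections import defaultdict


def build_document_graph(tokens, window=2):
    """Build an undirected co-occurrence graph for one document.

    Terms are vertices; an edge links two distinct terms that occur
    within `window` positions of each other. The window size is an
    assumption, not a value taken from the paper.
    """
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        graph[term]  # touch the key so isolated terms still become vertices
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return dict(graph)


def random_walk_weights(graph, damping=0.85, iterations=50):
    """Score terms with a damped random walk (TextRank-style).

    Each term spreads its score evenly over its neighbours; `damping`
    and `iterations` are illustrative defaults, not the paper's settings.
    """
    n = len(graph)
    if n == 0:
        return {}
    scores = {term: 1.0 / n for term in graph}
    for _ in range(iterations):
        updated = {}
        for term in graph:
            incoming = sum(scores[nb] / len(graph[nb]) for nb in graph[term])
            updated[term] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores


# Hypothetical toy document, already tokenized.
tokens = "ontology graph ontology concept graph relation".split()
doc_graph = build_document_graph(tokens)
weights = random_walk_weights(doc_graph)
```

In this sketch a term that co-occurs with many other terms accumulates more weight than a sparsely connected one. A corpus-level (global) component would have to be combined with these per-document scores, and how GRAONTO does so is not stated in the abstract.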
