An ontology is a structured representation of a domain that carries rich semantic meaning, and it therefore plays an important role in many knowledge-intensive applications. A domain ontology contains complete information about its concepts and extensive links between them. Constructing an agricultural domain ontology can provide the knowledge-organization foundation for a vertical agricultural search engine, promote agricultural informatization, and enable cooperative agricultural information services. Semantic similarity measures play an important role in ontology-based information retrieval and information integration. However, traditional semantic similarity algorithms for domain ontologies consider only a single influencing factor, which leads to poor convergence, low accuracy, strong subjectivity, and other defects. In this paper, a weighted semantic similarity algorithm based on an agricultural domain ontology was proposed. Given the structural characteristics of ontologies, the major factors that influence similarity are the structure factor, the properties of the concept, and semantic information, among others. The structure factor is in turn affected by relationship type, node density, node depth, and other factors. First, based on the structural characteristics of the ontology model, a new method for calculating node density was proposed. In addition, an integrated structural similarity model based on relationship type, node density, node depth, and semantic distance was given; together these were called the structure factors. Second, drawing on the literature and empirical knowledge, the property grid of ontology concept pairs was accessed to obtain the attributes of each concept. Third, based on the hierarchical network of the ontology, the B-U probability based on root and leaf nodes and the semantic information were calculated; this calculation did not rely on expert knowledge and was therefore objective. 
Fourth, by combining the structure, information, and property factors, an integrated semantic similarity algorithm was proposed. It recognized that the influencing factors differ in importance when computing semantic similarity, and the factors were therefore given different weights for agricultural ontology relations. Finally, taking part of an agricultural ontology as an example, the calculation of the semantic similarity between sweet corn and waxy maize was worked through in detail. Compared with other algorithms, the semantic similarity (0.8206) and standard deviation (0.0565) obtained with the proposed algorithm were closer to intuitive cognition and expert opinion, indicating that the algorithm can effectively improve the accuracy and validity of semantic similarity computation. In this paper, we computed semantic similarity values by studying the relationships between concept pairs at different depths of the hierarchical structure of agricultural ontologies. We compared the accuracy of semantic similarity calculation across four algorithms (the algorithm in this paper, an algorithm based on information content, an algorithm based on distance, and an algorithm using the standard deviation commonly applied in statistics). The results demonstrated that, with proper parameter selection and comprehensive similarity measures, the difficulty of distinguishing weakly correlated concepts can be significantly reduced. This study provides a deeper understanding of the application of semantic similarity to agricultural ontologies and shows how to choose appropriate semantic similarity measures for agricultural information retrieval. © 2016, Editorial Department of the Transactions of the Chinese Society of Agricultural Engineering. All rights reserved.
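
The weighted combination of the structure, property, and information factors described above can be illustrated with a minimal sketch. The abstract does not give the combination formula or the weight values, so the following assumes a simple linear combination with non-negative weights that sum to 1; the function name `weighted_similarity`, its parameters, and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def weighted_similarity(sim_structure, sim_property, sim_information, weights):
    """Combine three per-factor similarities into one score.

    Assumption (not stated in the abstract): the overall similarity is a
    linear combination of the factor similarities, each in [0, 1], with
    non-negative weights that sum to 1.
    """
    factors = (sim_structure, sim_property, sim_information)
    if len(weights) != len(factors):
        raise ValueError("expected one weight per factor")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    if any(not 0.0 <= s <= 1.0 for s in factors):
        raise ValueError("each factor similarity must lie in [0, 1]")
    # Weighted sum: a factor with a larger weight contributes more.
    return sum(w * s for w, s in zip(weights, factors))


# Hypothetical factor similarities and weights, for illustration only.
score = weighted_similarity(0.8, 0.9, 0.7, weights=(0.5, 0.3, 0.2))
```

Under these assumptions the result stays in [0, 1], and raising the weight of one factor shifts the overall score toward that factor's similarity, which is the behavior the abstract attributes to giving the factors different degrees of importance.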