Learning non-taxonomical semantic relations from domain texts

Janardhana Punuru,Jianhua Chen

doi:10.1007/s10844-011-0149-4

Abstract

Ontology of a domain mainly consists of concepts, taxonomical (hierarchical) relations and non-taxonomical relations. Automatic ontology construction requires methods for extracting both taxonomical and non-taxonomical relations. Compared to extensive works on concept extraction and taxonomical relation learning, little attention has been given on identification and labeling of non-taxonomical relations in text mining. In this paper, we propose an unsupervised technique for extracting non-taxonomical relations from domain texts. We propose the VF*ICF metric for measuring the importance of a verb as a representative relation label, in much the same spirit as the TF*IDF measure in information retrieval. Domain-relevant concepts (nouns) are extracted using techniques developed earlier. Candidate non-taxonomical relations are generated as (SVO) triples of the form (subject, verb, object) from domain texts. A statistical method with log-likelihood ratios is used to estimate the significance of relationships between concepts and to select suitable relation labels. Texts from two domains, the Electronic Voting (EV) domain texts and the Tenders and Mergers (TNM) domain texts are used to compare our method with one of the existing approaches. Experiments showed that our method achieved better performance in both domains.

Full Text