Abstract
The ontology of a domain consists mainly of concepts, taxonomical (hierarchical) relations, and non-taxonomical relations. Automatic ontology construction requires methods for extracting both taxonomical and non-taxonomical relations. Compared with the extensive work on concept extraction and taxonomical relation learning, little attention has been given to the identification and labeling of non-taxonomical relations in text mining. In this paper, we propose an unsupervised technique for extracting non-taxonomical relations from domain texts. We propose the VF*ICF metric for measuring the importance of a verb as a representative relation label, in much the same spirit as the TF*IDF measure in information retrieval. Domain-relevant concepts (nouns) are extracted using techniques developed earlier. Candidate non-taxonomical relations are generated from domain texts as subject-verb-object (SVO) triples of the form (subject, verb, object). A statistical method based on log-likelihood ratios is used to estimate the significance of relationships between concepts and to select suitable relation labels. Texts from two domains, Electronic Voting (EV) and Tenders and Mergers (TNM), are used to compare our method with one of the existing approaches. Experiments showed that our method achieved better performance in both domains.
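To make the two scoring ideas concrete, here is a minimal Python sketch. The abstract does not give the formulas, so everything below is an assumption: the VF*ICF weighting is written by analogy with a common log-scaled TF*IDF variant (verb frequency times the inverse of the number of distinct concepts a verb co-occurs with), and the log-likelihood ratio is the standard G² statistic on a 2x2 contingency table. The function names `vf_icf` and `log_likelihood_ratio` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def vf_icf(triples):
    """Score each verb in a list of (subject, verb, object) triples.

    ASSUMPTION: the weighting below mirrors log-scaled TF*IDF and is
    not necessarily the formula used in the paper.
      VF(v)  = number of triples containing verb v
      CF(v)  = number of distinct concepts appearing with verb v
      score  = (1 + ln VF(v)) * ln(|C| / CF(v))
    """
    verb_freq = Counter(verb for _, verb, _ in triples)
    concepts_by_verb = defaultdict(set)
    all_concepts = set()
    for subj, verb, obj in triples:
        concepts_by_verb[verb].update((subj, obj))
        all_concepts.update((subj, obj))
    total = len(all_concepts)
    return {
        verb: (1 + math.log(freq)) * math.log(total / len(concepts_by_verb[verb]))
        for verb, freq in verb_freq.items()
    }


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 table of co-occurrence counts.

    k11: both concepts present, k12: only the first,
    k21: only the second, k22: neither.
    ASSUMPTION: this is the textbook G^2 form; the paper may use a variant.
    """
    observed = ((k11, k12), (k21, k22))
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    total = rows[0] + rows[1]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            count = observed[i][j]
            if count > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += count * math.log(count / expected)
    return 2.0 * g2


# Toy triples invented for illustration only (not data from the paper).
example_triples = [
    ("voter", "cast", "ballot"),
    ("voter", "cast", "vote"),
    ("machine", "record", "vote"),
    ("voter", "use", "machine"),
    ("official", "use", "ballot"),
    ("machine", "use", "vote"),
]
example_scores = vf_icf(example_triples)
```

Under this assumed weighting, a verb such as "use" that co-occurs with every concept in the toy data receives a score of zero, just as a term that appears in every document does under TF*IDF, while verbs tied to a few concepts score higher. A larger G² value indicates that two concepts co-occur more often than independence would predict.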