Abstract
In general, we have utilized the hierarchical concept tree as a crucial data structure for indexing huge amount of textual data. This paper proposes a generality rank-based method that can automatically develop hierarchical concept structures with the Wikipedia data. The goal of the method is to regard each of Wikipedia articles as a concept and to generate hierarchical relationships among concepts. In order to estimate the generality of concepts, we have devised a special ranking function that mainly uses the number of hyperlinks among Wikipedia articles. The ranking function is effectively used for computing the probabilistic subsumption among concepts, which allows to generate relatively more stable hierarchical structures. Eventually, a set of concept pairs with hierarchical relationship is visualized as a DAG (directed acyclic graph). Through the empirical analysis using the concept hierarchy of Open Directory Project, we proved that the proposed method outperforms a representative baseline method and it can automatically extract concept hierarchies with high accuracy.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.