Abstract

Uncovering a concept hierarchy from short texts, such as tweets and instant messages, is critical for helping users quickly understand the main concepts and sub-concepts in large volumes of such texts. However, due to the sparsity of short texts, existing hierarchical models fail to learn the structural relations among concepts and discover the data more deeply. To solve this problem, we introduce a new notion called context coherence. Context coherence reflects the coverage of a word in a collection of short texts. This coverage is measured by analyzing the relations of words in whole texts. The major advantage of context coherence is that it aligns with the requirements of a concept hierarchy and can lead to a meaningful structure. Moreover, we propose a novel non-parametric context coherence-based model (CCM) that can discover the concept hierarchy from short texts without a pre-defended hierarchy depth and width. We evaluate our model on two real-world datasets. The quantitative evaluations confirm the high quality of the concept hierarchy discovered by our model compared with those of state-of-the-art methods.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call