Abstract

Text feature representation is a fundamental problem studied in many text analysis tasks, such as text classification. However, most existing text feature extraction methods, for example bag-of-words (BOW), focus on the text itself. In this work, we propose to use Knowledge Graphs (KGs) to enrich text representation from a novel heterogeneous information network (HIN) perspective. The complexity of KGs raises two main challenges. The first is how to resolve ambiguity when mapping the entities in a text to a KG. The second is how to incorporate the relations among entities in the same document, which indicate the intra-document semantics. To solve these problems, we present a novel Meta-Path Based Text Feature Enrichment (MeTEN) method. MeTEN maps nouns and noun phrases in a text to entities in a KG, and discovers the relations among them, represented as meta paths in the KG, through a novel bi-directional meta path generation algorithm. Extensive experiments on real-world datasets demonstrate that MeTEN effectively enriches text features and thus improves text classification.
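To make the notion of a meta path concrete, the following is a minimal sketch in Python. It is not the bi-directional generation algorithm of MeTEN, whose details are not given here; it uses a plain depth-limited search instead. The toy graph, the entity types, the relation names, and the helper names `build_adjacency` and `meta_paths` are all assumptions made for illustration. The sketch enumerates paths between two entities and abstracts each path into a sequence of entity types and relation types, which is what a meta path records.

```python
from collections import defaultdict

# Toy typed knowledge graph. Every entity, type and relation below is
# hypothetical and exists only to illustrate the idea.
ENTITY_TYPE = {
    "Barack Obama": "Person",
    "Michelle Obama": "Person",
    "Honolulu": "City",
    "USA": "Country",
}
TRIPLES = [
    ("Barack Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "USA"),
    ("Barack Obama", "spouse_of", "Michelle Obama"),
    ("Michelle Obama", "citizen_of", "USA"),
]


def build_adjacency(triples):
    """Index triples in both directions; inverse edges get a '^-1' suffix."""
    adj = defaultdict(list)
    for head, rel, tail in triples:
        adj[head].append((rel, tail))
        adj[tail].append((rel + "^-1", head))
    return adj


def meta_paths(source, target, triples, entity_type, max_hops=3):
    """Return the distinct meta paths (type/relation sequences) linking
    source to target within max_hops edges, using a depth-limited search."""
    adj = build_adjacency(triples)
    found = set()

    def walk(node, visited, signature):
        if node == target:
            found.add(tuple(signature))
            return
        # signature holds 1 type plus 2 items per hop taken so far.
        if (len(signature) - 1) // 2 >= max_hops:
            return
        for rel, nxt in adj[node]:
            if nxt not in visited:  # keep paths simple (no repeated entity)
                walk(nxt, visited | {nxt}, signature + [rel, entity_type[nxt]])

    walk(source, {source}, [entity_type[source]])
    return sorted(found)


if __name__ == "__main__":
    for path in meta_paths("Barack Obama", "USA", TRIPLES, ENTITY_TYPE):
        print(" -> ".join(path))
```

In this toy graph, two entities that co-occur in a document ("Barack Obama" and "USA") are connected by two different meta paths, one through a city and one through a spouse. Relations of this kind are what the abstract refers to as intra-document semantics.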
