Abstract

Comprehending social media discussions in short-text microblogs is fundamental for knowledge-based applications such as recommender systems. Twitter, for example, provides rich real-time information in keeping with its streaming nature. Making sense of such data without automated support is infeasible because of its vast size and nature. The problem becomes harder when the data in question have low topical diversity. An automatic method for understanding textual patterns in such topically constrained data is therefore needed. A major challenge in building such a system is enabling it to account for the diversity of word-structure correlations, the sparsity of the vocabulary, and the factors that distinguish the generated topics from one another. In this paper, we present a novel semi-supervised approach, metamodel-enabled latent Dirichlet allocation, to address this challenge. In contrast to state-of-the-art approaches, our model incorporates a domain-specific metamodel. The metamodel is defined as a set of topic label vectors derived from long texts, which guide the learning process in shorter texts.
