Abstract
Microblogging platforms have emerged as large collections of short documents. Retrieving short text effectively is a significant research challenge owing to several factors: creative language usage, high contextualization, the informal nature of microblog posts, and the limited length of this form of communication. As a result, microblog retrieval systems suffer from data sparseness and the semantic gap. They cannot accurately meet users’ information needs: users compose tweets with few terms, and a relevant tweet often contains none of the query terms, so many relevant tweets are not retrieved. To overcome these problems, recent studies on content-based microblog search have focused on adding semantics to microposts by linking short text to knowledge base resources. Previous studies use a bag-of-concepts representation, linking named entities to their corresponding knowledge base concepts. However, a bag-of-concepts representation considers only the concepts that match named entities and assumes that all concepts are equivalent and independent. In this paper, we present a graph-of-concepts method that considers the relationships among the concepts that match named entities in short text and their related concepts. The method contextualizes each concept in the graph by leveraging the linked nature of DBpedia, a Linked Open Data knowledge base, together with graph-based centrality theory. Furthermore, we propose a similarity measure that computes the similarity between two graphs (query and tweet) by considering the relationships between the contextualized concepts. Finally, we present experimental results on a real Twitter dataset that demonstrate the effectiveness of our system.
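To make the graph-of-concepts idea concrete, the following is a minimal illustrative sketch and not the method evaluated in the paper. It assumes that concepts have already been linked to the text, that relations between them are supplied as a hard-coded list instead of being fetched from DBpedia, that degree centrality stands in for the centrality measure, and that a centrality-weighted Jaccard score stands in for the proposed similarity measure. The function names `build_graph`, `degree_centrality`, and `graph_similarity` are hypothetical.

```python
def build_graph(concepts, related):
    """Build an undirected concept graph as an adjacency dict.

    `concepts` is an iterable of concept identifiers; `related` is an
    iterable of (a, b) pairs. In this sketch the pairs are given by hand;
    a real system would derive them from the knowledge base.
    """
    graph = {c: set() for c in concepts}
    for a, b in related:
        if a in graph and b in graph and a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Degree centrality: number of neighbours divided by (n - 1)."""
    n = len(graph)
    if n <= 1:
        return {c: 0.0 for c in graph}
    return {c: len(neighbours) / (n - 1) for c, neighbours in graph.items()}


def graph_similarity(g1, g2):
    """Centrality-weighted Jaccard similarity between two concept graphs.

    Each concept is weighted by 1 + its degree centrality (an assumption of
    this sketch), so well-connected concepts count more than isolated ones.
    """
    w1 = {c: 1.0 + v for c, v in degree_centrality(g1).items()}
    w2 = {c: 1.0 + v for c, v in degree_centrality(g2).items()}
    nodes = set(w1) | set(w2)
    if not nodes:
        return 0.0
    shared = sum(min(w1.get(c, 0.0), w2.get(c, 0.0)) for c in nodes)
    total = sum(max(w1.get(c, 0.0), w2.get(c, 0.0)) for c in nodes)
    return shared / total


# Illustrative concepts only; the relations below are hand-written.
query = build_graph(
    ["Barack_Obama", "White_House"],
    [("Barack_Obama", "White_House")],
)
tweet = build_graph(
    ["Barack_Obama", "Michelle_Obama"],
    [("Barack_Obama", "Michelle_Obama")],
)
score = graph_similarity(query, tweet)
```

Unlike a bag-of-concepts comparison, the weights here depend on how each concept is connected within its own graph, which is the property the abstract attributes to the contextualized representation.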