Semantic based entity retrieval and disambiguation system for Twitter streams

Senthil Kumar Narayanasamy, M Dinakaran

doi:10.34105/j.kmel.2019.11.014

Abstract

Social media networks have evolved into a large repository of short documents, which makes it challenging to retrieve content from them effectively. Several factors contribute to this difficulty, such as the restricted length of the content, informal use of language (e.g., slang, abbreviations, styles), and the low contextualization of user-generated content. To address these problems, recent studies on context-based information searching have built on adding semantics to user-generated content by linking it to existing knowledge bases. Earlier work also used bag-of-concepts to link potential noun phrases to existing knowledge sources. In this paper, we utilize the relationships among concepts, and the equivalences among the related concepts of the selected named entities, to derive the potential meaning of entities and to find the semantic similarity between the named entities using three potential sources of reference (DBpedia, Anchor Texts and Twitter Trends).

Highlights

Searching on microblogging systems suffers heavily from data sparseness and data redundancy.
To bring out the semantic proximity between the set of ambiguous mentions from DBpedia and their candidate entities, we measured semantic similarity by considering the weights and the paths that exist between the connected nodes.
Once the potential named entities have been identified from the Twitter datasets using any of the three methods described, the crucial task is to assign the extracted named entities to predefined types or classes such as person, product, geographical location, time, company, etc. Although many Information Retrieval (IR) techniques have been proposed for document processing (Ifrim, Shi, & Brigadir, 2014; Liang et al., 2014), they fail to categorize entities into their associated domains or classes when the entities are extracted from unstructured text such as Twitter streams.
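The path-and-weight similarity mentioned in the highlights can be illustrated with a minimal sketch. The source does not give the paper's actual formula, so this example assumes a stand-in: the weighted shortest-path distance d between a mention node and a candidate node (computed with Dijkstra's algorithm) is mapped to a similarity of 1 / (1 + d). The graph layout, the function names `weighted_distance` and `path_similarity`, and the edge weights are all hypothetical.

```python
import heapq

def weighted_distance(graph, source, target):
    """Weighted shortest-path distance via Dijkstra's algorithm.

    graph is a dict of dicts: graph[node][neighbor] = non-negative edge weight.
    Returns float('inf') when target cannot be reached from source.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return float("inf")

def path_similarity(graph, mention, candidate):
    """Assumed mapping from distance to similarity: 1 / (1 + d).

    Identical nodes score 1.0; unreachable pairs score 0.0.
    """
    return 1.0 / (1.0 + weighted_distance(graph, mention, candidate))

# Hypothetical toy graph; the node names and weights are illustrative only.
toy_graph = {
    "mention": {"category": 1.0, "candidate_b": 4.0},
    "category": {"candidate_a": 1.0},
    "candidate_a": {},
    "candidate_b": {},
}
```

With this toy graph, `candidate_a` lies closer to `mention` than `candidate_b` does, so it receives the higher similarity score. Any monotonically decreasing function of distance would serve the same illustrative purpose.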

Summary

Introduction

Searching on microblogging systems suffers heavily from data sparseness and data redundancy. To overcome these problems, a semantic-based retrieval system is needed that removes the ambiguity persisting in the (unstructured) text and links the entities in the text to the appropriate real-world entity sets. This has brought entity-based retrieval into focus for microblog search, disambiguating entities against populated knowledge base ontologies (such as DBpedia, Freebase, YAGO, etc.). Such systems encounter a large number of ambiguous entities, which yield contradictory results. To resolve these ambiguities, we propose a three-way strategic approach, consisting of a DBpedia-based semantic measure, anchor-text-based cosine similarity, and Twitter popularity trend detection, to filter out the ambiguous entities and map each mention exactly to the context of the given tweet(s). We have preferred this topic for empirical analysis since it has attained huge reach and collected a high volume of responses.
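As a rough illustration of the anchor-text-based cosine similarity named above, the following sketch compares a tweet with the anchor text of each candidate entity using bag-of-words term-frequency vectors. It is not the authors' implementation: the whitespace tokenization, the function names `cosine_similarity` and `rank_candidates`, and the example tweet and anchor texts are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between term-frequency vectors of two texts.

    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)

def rank_candidates(tweet: str, candidates: dict) -> list:
    """Return (entity, score) pairs sorted by descending similarity.

    candidates maps an entity name to its anchor text.
    """
    scored = [(name, cosine_similarity(tweet, anchor))
              for name, anchor in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical example data, not taken from the paper.
example_tweet = "apple unveils new iphone"
example_candidates = {
    "Apple_(fruit)": "apple fruit tree orchard",
    "Apple_Inc.": "apple iphone maker technology company",
}
example_ranking = rank_candidates(example_tweet, example_candidates)
```

In this toy case the company entity shares two terms with the tweet while the fruit entity shares one, so the company ranks first. A real pipeline would likely add stop-word removal and TF-IDF weighting, which are omitted here for brevity.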

Related works

Proposed semantic retrieval context

DBpedia based entity disambiguation

Entity labeling

Disambiguation pages

Redirect pages

Anchor text-based similarity measure

Twitter popularity-based trends measure

Classification of named entities

Empirical results

Conclusion

Findings

Further research directions


Journal: Knowledge Management & E-Learning: An International Journal
Publication Date: Jun 28, 2019
License type: cc-by
