A Semantic and Syntactic Similarity Measure for Political Tweets

Claire Little,David Mclean,Bruce Edmonds,Keeley Crockett

doi:10.1109/access.2020.3017797

Open Access

https://doi.org/10.1109/access.2020.3017797

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 5
License type: CC BY 4.0

Affiliation: Manchester Metropolitan University

Abstract

Measuring the semantic and syntactic similarity of human utterances is essential if machines are to understand dialogue with users. However, human language is complex, and the meaning of an utterance usually depends on its context at a given time and on learnt experience of the meaning of the words used. This is particularly challenging when automatically interpreting social media text, such as tweets, which can contain non-standard language. Short Text Semantic Similarity measures can be adapted to measure the degree of similarity of a pair of tweets. This work presents a new Semantic and Syntactic Similarity Measure (TSSSM) for political tweets. The approach uses word embeddings to determine semantic similarity and extracts syntactic features to overcome a limitation of current measures, which may miss identical sequences of words. A large dataset of tweets focusing on the political domain was collected, pre-processed and used to train the word embedding model, with various experiments performed to determine the optimal model and parameters. A selection of tweet pairs was evaluated by humans for semantic equivalence and the human ratings were correlated against the measure. The new measure can be used in a variety of applications, including identifying and analyzing political narratives. Experiments on three diverse human-labelled test datasets demonstrate that the measure outperforms an existing measure, performs well on tweets from the political domain and may also generalize outside the political domain.
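
The evaluation step described in the abstract, correlating human ratings of tweet pairs against the measure's scores, can be illustrated with a minimal sketch. The abstract does not say which correlation statistic was used, so Pearson's r is an assumption here, and the ratings and scores below are invented placeholder values, not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical placeholder data: one entry per tweet pair.
human_ratings = [0.0, 1.5, 3.0, 4.5]    # assumed human equivalence ratings
measure_scores = [0.1, 0.3, 0.6, 0.9]   # assumed similarity scores in [0, 1]

r = pearson(human_ratings, measure_scores)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would indicate that the measure ranks and spaces tweet pairs much as the human raters do.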

Highlights

The ability to determine the similarity between two texts has applications in categorization, cluster analysis, dialogue systems, and document identification and matching.
A large dataset of tweets pertaining to Brexit was collected, and a word embedding model was used to learn the semantic relationships between the words in the dataset.
The Word Embedding Model (WEM) was used for the semantic element of the measure, while the syntactic element considered sequences of words and features.
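
The two elements named in the highlights can be sketched as follows. This is an illustration only and not the authors' TSSSM formula: the toy embedding table, the averaging of word vectors, the bigram overlap used as the syntactic element, and the `weight` parameter are all assumptions made for the example.

```python
import math

# Hypothetical toy embeddings; a real WEM would be trained on a tweet corpus.
TOY_EMBEDDINGS = {
    "brexit": [0.9, 0.1, 0.0],
    "deal": [0.7, 0.3, 0.1],
    "vote": [0.6, 0.4, 0.2],
    "leave": [0.8, 0.2, 0.1],
    "weather": [0.0, 0.1, 0.9],
    "sunny": [0.1, 0.0, 0.8],
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def mean_vector(tokens, embeddings):
    """Average the vectors of known tokens; None if no token is known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_similarity(a, b, embeddings):
    """Semantic element: cosine similarity of averaged word embeddings."""
    va = mean_vector(tokenize(a), embeddings)
    vb = mean_vector(tokenize(b), embeddings)
    if va is None or vb is None:
        return 0.0
    return cosine(va, vb)


def syntactic_similarity(a, b):
    """Syntactic element (assumed): Jaccard overlap of word bigrams."""
    ta, tb = tokenize(a), tokenize(b)
    ba, bb = set(zip(ta, ta[1:])), set(zip(tb, tb[1:]))
    if not ba or not bb:
        return 0.0
    return len(ba & bb) / len(ba | bb)


def combined_similarity(a, b, embeddings, weight=0.7):
    """Weighted sum of both elements; the weight is an assumed parameter."""
    semantic = semantic_similarity(a, b, embeddings)
    syntactic = syntactic_similarity(a, b)
    return weight * semantic + (1 - weight) * syntactic


related = combined_similarity("brexit deal vote", "brexit deal leave", TOY_EMBEDDINGS)
unrelated = combined_similarity("brexit deal vote", "weather sunny", TOY_EMBEDDINGS)
print(f"related pair: {related:.3f}, unrelated pair: {unrelated:.3f}")
```

The bigram term rewards tweet pairs that share identical word sequences, which averaged embeddings alone cannot distinguish from pairs that merely use related vocabulary.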

Summary

Introduction

The ability to determine the similarity between two texts has applications in categorization, cluster analysis, dialogue systems, and document identification and matching. Large-scale social media data present challenges when it comes to automating these processes. The ability to automatically identify content on a particular theme, or to find similar (or dissimilar) text, has many applications and may be crucial to understanding and identifying the various narratives on social media platforms. Twitter is a microblogging and social networking platform where users interact and post messages known as tweets. Users may post their own tweets, "like" other users' tweets, retweet (or share) tweets, and quote or reply to tweets.

Similar Papers

Sentence similarity measuring by vector space model
U. L. D. N. Gunasinghe ... W. A. D. Sashika
01 Dec 2014

SISR: System for integrating semantic relatedness and similarity measures
Mohamed Ben Aouicha ... Mohamed Ali Hadj Taieb
Soft Computing | VOL. 22
21 Nov 2016

Using ontology for measuring semantic similarity for question answering system
Muthukrishnan Ramprasath ... Shanmugasundaram Hariharan
01 Aug 2012

Analysis of tweets to find the basis of popularity based on events semantic similarity
Rajat Kumar Mudgal ... Alfredo Milani
International Journal of Web Information Systems | VOL. 14
27 Nov 2018
