Abstract

In this paper, we present the development of a Bilingual Sentiment Analysis Lexicon (BiSAL) for the cyber security domain. BiSAL consists of a Sentiment Lexicon for ENglish (SentiLEN) and a Sentiment Lexicon for ARabic (SentiLAR), which can be used to develop opinion mining and sentiment analysis systems for bilingual textual data from Dark Web forums. For SentiLEN, a list of 279 sentiment-bearing English words related to cyber threats, radicalism, and conflicts is identified, and a process is devised to unify their sentiment scores obtained from four different sentiment data sets. For SentiLAR, sentiment-bearing Arabic words are identified from a collection of 2000 message posts from the Alokab Web forum, which contains radical content. SentiLAR provides a list of 1019 sentiment-bearing Arabic words related to cyber threats, radicalism, and conflicts, along with their morphological variants and sentiment polarity. For polarity determination, three Arabic language experts perform a semi-automated analysis, and their ratings are combined using aggregate functions. A Web interface is developed to access both lexicons (SentiLEN and SentiLAR) of the BiSAL data set online, and a beta version is available at http://www.abulaish.com/bisal.
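To illustrate what aggregating the three experts' polarity ratings could look like, the following is a minimal sketch only. The abstract does not name the aggregate functions used, so the mean and the majority vote shown here are assumptions, as are the function name `aggregate_polarity` and the rating scale of -1 (negative), 0 (neutral), and +1 (positive).

```python
from collections import Counter
from statistics import mean


def aggregate_polarity(ratings):
    """Combine expert ratings for one word (hypothetical helper).

    ratings: list of ints in {-1, 0, +1}, one per expert (assumed scale).
    Returns the mean rating and a strict-majority label; when no label
    is chosen by more than half of the experts, the label falls back to 0.
    """
    label, votes = Counter(ratings).most_common(1)[0]
    majority = label if votes > len(ratings) / 2 else 0
    return {"mean": mean(ratings), "majority": majority}


# Two of three experts rate the word negative, one rates it neutral.
result = aggregate_polarity([-1, -1, 0])
print(result["majority"])
```

With full disagreement among the experts (for example `[1, 0, -1]`), this sketch returns a neutral majority label, which is one possible tie-breaking choice rather than the one documented for SentiLAR.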
