Abstract
Identifying non-genuine or malicious messages is challenging because the techniques used by cyber-criminals change continuously. In this article, we propose a hybrid detection method that combines image and text spam recognition techniques. The image analysis relies on sparse representation-based classification, which focuses on global and local image features, together with a dictionary learning technique that yields a spam sub-dictionary and a ham sub-dictionary. The textual analysis, in turn, uses semantic properties of documents to assess their level of maliciousness. More specifically, we are able to distinguish between meta-spam and real spam. Experimental results show the accuracy and potential of our approach.
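To make the image-side idea concrete, the following is a minimal sketch of sparse representation-based classification with two class sub-dictionaries. It is an illustration under stated assumptions, not the implementation evaluated in the article: the sub-dictionaries are assumed to be already learned, the feature vectors are toy values rather than real global or local image features, and orthogonal matching pursuit is swapped in as a simple sparse coder. The names `omp` and `src_classify` are hypothetical.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit: approximate y using few columns of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Refit all selected atoms by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def src_classify(D_spam, D_ham, y, n_nonzero=3):
    """Label y by whichever sub-dictionary reconstructs it with the smaller residual."""
    # Assumption: no dictionary column is all zeros (columns are normalised below).
    D = np.hstack([D_spam, D_ham])
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    x = omp(D, y, n_nonzero)
    n_spam = D_spam.shape[1]
    r_spam = np.linalg.norm(y - D[:, :n_spam] @ x[:n_spam])
    r_ham = np.linalg.norm(y - D[:, n_spam:] @ x[n_spam:])
    return "spam" if r_spam < r_ham else "ham"

# Toy sub-dictionaries: spam atoms span the first two axes, ham atoms the last two.
D_spam = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
D_ham = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

spam_like = np.array([0.9, 0.2, 0.05, 0.0])
ham_like = np.array([0.0, 0.1, 0.8, 0.3])
print(src_classify(D_spam, D_ham, spam_like))
print(src_classify(D_spam, D_ham, ham_like))
```

The decision rule compares class-wise reconstruction residuals, so a test vector is assigned to the class whose atoms explain most of it. In practice the sub-dictionaries would come from a dictionary learning step on labelled image features, which this sketch leaves out.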
Highlights
The ability to assess malicious and non-genuine communication is crucial in all our activities, which are undeniably based on information sharing
A hybrid method to determine whether an unstructured dataset is of malicious nature is discussed
The main motivation is to provide a more flexible and accurate spam detection based on unstructured datasets
Summary
The ability to assess malicious and non-genuine communication is crucial in all our activities, which are undeniably based on information sharing. Spam emails, in which a user receives unwanted messages on a variety of topics, are certainly a form of non-genuine communication. This type of communication can also be used to hide another message, as in the communication shared by terrorist cells after the 9/11 attack [2]. The way terrorists attempted to share information goes beyond the strict definition of spam: their communication was hidden inside a non-genuine message. We will use the terms “non-genuine message” and “spam” interchangeably, unless we wish to specify mutually exclusive features, in which case this will be clearly stated.