Abstract

Although conventional technologies are quite good at separating spam messages, many additional measures must still be considered to improve the accuracy of spam filtering. In this work, we detect spam emails and filter them during transmission. We propose a Collaborative filtering approach hybridized with semantics-based text classification, in which the relevant features are retrieved from the text content. We also propose a second method, Content-based filtering, which filters the same spam mail with greater precision and accuracy. In addition to the semantic text, the Content-based filter handles special symbols such as HTML tags, @, and /. A comparison of the results shows that Content-based filters detect spam emails more accurately than Collaborative filters. Both Collaborative and Content-based filters check each message against the keywords in a spam keyword database and determine whether the mail sent by the sender is genuine or spam. Genuine emails are delivered successfully, and spam emails are blocked at the server side. Content-based email classification requires an understanding of both the structural and the semantic attributes of email. Conventional research has focused on semantic properties through the structural components of email. After analysing emails as events (a major subset of the class of email), we devised a rich contextual test-bed representation for understanding the semantic attributes of emails. Event-based emails have traditionally been studied based on simple structural properties.
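The keyword check and special-symbol handling described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the keyword set `SPAM_KEYWORDS`, the function names, and the thresholds `min_keywords` and `max_markup` are all hypothetical, chosen only to show how a content-based filter might combine keyword hits with counts of HTML tags and the symbols @ and /.

```python
import re

# Hypothetical keyword database; the actual list used in this work is not given.
SPAM_KEYWORDS = {"lottery", "winner", "free", "prize"}

TAG_RE = re.compile(r"<[^>]+>")   # matches HTML tags such as <b> or </p>
SYMBOL_RE = re.compile(r"[@/]")   # the special symbols named in the abstract


def keyword_hits(text):
    """Return the spam keywords that occur as whole words in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return words & SPAM_KEYWORDS


def content_features(raw):
    """Count HTML tags and special symbols, then check keywords on the stripped text."""
    tags = TAG_RE.findall(raw)
    stripped = TAG_RE.sub(" ", raw)  # remove tags before counting symbols and words
    return {
        "html_tags": len(tags),
        "symbols": len(SYMBOL_RE.findall(stripped)),
        "keywords": keyword_hits(stripped),
    }


def is_spam(raw, min_keywords=2, max_markup=3):
    """Illustrative rule (an assumption): several keyword hits, or one hit plus heavy markup."""
    f = content_features(raw)
    markup = f["html_tags"] + f["symbols"]
    hits = len(f["keywords"])
    return hits >= min_keywords or (hits >= 1 and markup > max_markup)
```

Under these assumptions, `is_spam("<b>Free</b> lottery prize! Reply to win@example.com")` flags the message because three keywords match, while a plain message with no keyword hits passes through. A server-side filter would block the former and deliver the latter.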
