Email Spam Classification Research Articles

Accreditation can be understood as a government effort to standardize and guarantee the quality of university graduates, so that quality does not vary too widely between universities and matches the needs of the workplace. SAPTO, the Online Higher Education Accreditation System, is a system run by BAN-PT for conducting higher education accreditation online. It was developed to improve the efficiency and quality of the accreditation process. In SAPTO, a university acts as the entity that submits accreditation proposals, both for Higher Education Accreditation and for Study Program Accreditation. BAN-PT provides a complaint service for technical issues in SAPTO through the hotline-sapto email account, as well as a question-and-answer service through the same account that universities can use for matters related to accreditation with SAPTO. Submissions and questions about the accreditation process are still handled directly by BAN-PT public relations staff, who, in addition to answering questions, spend considerable time reading spam messages and unwanted, irrelevant questions. Email is also widely misused by senders who seek to profit from spam. This study classifies spam emails from the hotline-sapto account, covering preprocessing and the calculation of accuracy and AUC for several data mining classification methods, including the Naive Bayes algorithm and the Support Vector Machine (SVM). The aim is to identify the algorithm that predicts spam emails most accurately. In the tests, the SVM method with particle swarm optimization (PSO) achieved an accuracy of 85.25% with an AUC of 0.892.
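The abstract above reports two evaluation metrics, accuracy and AUC. The following is a minimal sketch of how such metrics can be computed, not the study's own code: the function names are hypothetical, and the labels and classifier scores are made-up toy values (1 = spam, 0 = legitimate).

```python
def accuracy(y_true, y_pred):
    """Fraction of emails whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen spam email receives a higher score than a randomly chosen
    legitimate email (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one spam and one legitimate email")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

# Turn scores into hard labels with an assumed threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC summarizes how well the scores rank spam above legitimate mail across all thresholds, which is one reason studies often report both.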

Purpose: Email is a fast and inexpensive medium for sharing information, but unsolicited email (spam) is a constant problem in email communication. The rapid growth of spam creates the need for a reliable and robust spam classifier. This paper presents a study of evolutionary classifiers (genetic algorithm [GA] and genetic programming [GP]) with and without an ensemble of classifiers. In this research, the classifier ensemble was built with the adaptive boosting technique.

Design/methodology/approach: Text mining methods are applied to classify spam emails and legitimate emails. Two data sets (Enron and SpamAssassin) are used to test the classifiers. First, pre-processing extracts the features (words) from the email files. An informative feature subset is then selected with the greedy stepwise feature subset search method. Using these informative features, a comparative study is performed first among the evolutionary classifiers and then against other popular machine learning classifiers (Bayesian, naive Bayes and support vector machine).

Findings: The study shows that evolutionary algorithms are promising in classification and prediction applications; genetic programming with adaptive boosting turned out to be not only an accurate classifier but also a sensitive one. Results show that GA initially performs better than GP, but after ensembling (a large number of iterations) GP overtakes GA with significantly higher accuracy. Among all classifiers, boosted GP combines good classification accuracy with a low false positive (FP) rate, which is considered an important criterion in email spam classification. Greedy stepwise feature search is also found to be an effective feature selection method in this application domain.

Research limitations/implications: The implication of this research is a reduction in the cost incurred because of spam (unsolicited bulk email). Email is fundamental for sharing information among the units of an organization that must stay competitive with its business rivals. In addition, it is a continual challenge for internet service providers to provide the best email services to their customers. Although organizations and internet service providers continuously adopt novel spam filtering approaches to reduce the number of unwanted emails, the desired effect has not been seen to a significant degree because of the cost of installation, issues of customizability and the threat of misclassifying important emails. This research deals with the issues and challenges faced by internet service providers and organizations.

Practical implications: The proposed models not only provided excellent accuracy and sensitivity with a low FP rate, along with customizable capability, but also helped reduce the cost of spam. The same models may be used for other text mining applications, such as sentiment analysis, blog mining or news mining.

Originality/value: A comparison between GP and GA, with and without an ensemble, is shown in the spam classification application domain.
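The abstract above mentions selecting an informative feature subset with a greedy stepwise search. As a rough illustration of the forward variant of that idea, and not the paper's implementation, the sketch below adds one word at a time as long as a score improves. The evaluator `rule_accuracy`, the function names and the toy emails are all assumptions made for this example; the paper does not state which subset evaluator it used.

```python
def rule_accuracy(selected, emails, labels):
    """Toy subset evaluator (an assumption, not the paper's): predict spam (1)
    if an email contains at least one selected word, then return accuracy."""
    preds = [1 if selected & words else 0 for words in emails]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def greedy_forward_selection(candidates, emails, labels, evaluate=rule_accuracy):
    """Greedy stepwise (forward) search: repeatedly add the single feature
    that improves the score the most; stop when no feature improves it."""
    selected = set()
    best = evaluate(selected, emails, labels)
    while True:
        best_word, best_score = None, best
        # Sorted iteration makes tie-breaking deterministic.
        for word in sorted(candidates - selected):
            score = evaluate(selected | {word}, emails, labels)
            if score > best_score:
                best_word, best_score = word, score
        if best_word is None:
            break  # no single addition improves the score
        selected.add(best_word)
        best = best_score
    return selected, best


# Toy corpus: each email is a set of words (assumed data, 1 = spam).
emails = [
    {"free", "prize", "click"},
    {"winner", "prize", "now"},
    {"free", "meeting", "offer"},
    {"meeting", "agenda", "now"},
    {"report", "click", "agenda"},
    {"lunch", "meeting", "now"},
]
labels = [1, 1, 1, 0, 0, 0]
candidates = set().union(*emails)

selected, best = greedy_forward_selection(candidates, emails, labels)
print(sorted(selected), best)
```

Because the search is greedy, it evaluates only a small fraction of all possible subsets, so it can miss the best subset; in exchange, it stays cheap even when the vocabulary is large.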

Articles published on Email Spam Classification

Detection and Classification of Legitimate and Spam Emails using K-Nearest Neighbor Augmented with Quadratic Sieve Algorithm

Hybrid Feature Selection and Ensemble Learning Method for Spam Email Classification

Optimization of K Value in KNN Algorithm for Spam and Ham Email Classification

Email Prioritization Using Machine Learning

Email Spam Classification Using Gated Recurrent Unit and Long Short-Term Memory

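Klasifikasi Algoritma Naïve Bayes dan SVM Berbasis PSO Dalam Memprediksi Spam Email Pada Hotline-Sapto [Classification Using PSO-Based Naïve Bayes and SVM Algorithms to Predict Spam Email on Hotline-Sapto]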

A Machine Learning Based Email Spam Classification Framework Model: Related Challenges and Issues

A lifelong spam emails classification model

Effect of Header-based Features on Accuracy of Classifiers for Spam Email Classification

Predictive analytics for spam email classification using machine learning techniques

Data analytics for network intrusion detection

A study of boosted evolutionary classifiers for detecting spam

Spam e-mail classification for the Internet of Things environment using semantic similarity approach

The psychological interaction of spam email features

A Survey on Various Machine Learning and Deep Learning Algorithms used for Classification of Spam and Non-Spam Emails

Detecting Phishing Attack and Spam Email Classification

A comparative study of text mining in big data analytics using deep learning and other machine learning algorithms

Antlion optimization and boosting classifier for spam email detection
