Optimizing Feedforward Neural Networks Using Biogeography Based Optimization for E-Mail Spam Identification

Ali Rodan, Hossam Faris, Ja’Far Alqatawna

doi:10.4236/ijcns.2016.91002

Abstract

Spam e-mail has a significant negative impact on individuals and organizations and is considered a serious waste of resources, time and effort. Spam detection is a complex and challenging task. In the literature, researchers and practitioners have proposed numerous approaches for automatic e-mail spam detection. Learning-based filtering is one of the important approaches, in which a filter is trained to extract knowledge that can then be used to detect spam. In this context, Artificial Neural Networks are widely used as machine-learning-based filters. In this paper, we propose the use of a common type of Feedforward Neural Network, the Multi-Layer Perceptron (MLP), for e-mail spam identification, where the weights of the network are found using a new nature-inspired metaheuristic algorithm called Biogeography Based Optimization (BBO). Experiments on two different spam datasets show that the MLP model trained by BBO achieves high generalization performance compared to other optimization methods used in the literature for e-mail spam detection.

Highlights

Spam can be defined as a form of unwanted communication, usually sent in large volume, that negatively affects network bandwidth, server storage, user time and work productivity [1]-[4]
We develop a Multilayer Perceptron (MLP) neural network model trained with Biogeography Based Optimization (BBO) [22] for identifying e-mail spam
This paper is organized as follows: Section 2 gives a broad description of Multilayer Perceptron (MLP) neural networks; Section 3 introduces Biogeography Based Optimization (BBO) and explains how to perform it; Section 4 describes BBO for training the MLP; Section 5 describes the datasets; Section 6 presents the experiments and analyzes the results; and Section 7 outlines some conclusions

Summary

Introduction

Spam can be defined as a form of unwanted communication, usually sent in large volume, that negatively affects network bandwidth, server storage, user time and work productivity [1]-[4]. Learning-based spam filters must be trained, which requires a large e-mail dataset containing both spam and legitimate messages. Most of these filters use Machine Learning (ML) algorithms such as the Naive Bayes Classifier [11], Support Vector Machines [12] and Artificial Neural Networks [1]. We develop an MLP neural network model trained with Biogeography Based Optimization (BBO) [22] for identifying e-mail spam. This paper is organized as follows: Section 2 gives a broad description of Multilayer Perceptron (MLP) neural networks; Section 3 introduces Biogeography Based Optimization (BBO) and explains how to perform it; Section 4 describes BBO for training the MLP; Section 5 describes the datasets; Section 6 presents the experiments and analyzes the results; and Section 7 outlines some conclusions.

Multilayer Perceptron Neural Networks

Biogeography Based Optimization

BBO for Training MLP
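
The idea of letting BBO search the weight space of an MLP can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`forward`, `mse`, `bbo_train`), the linear migration model, the elitism count, the mutation rate, the weight bound, the sigmoid activations, the mean-squared-error fitness and the toy data are all assumptions chosen for illustration. Each habitat is one flat vector holding every weight and bias of a single-hidden-layer MLP. Better habitats share their features with worse ones through migration, and occasional mutation keeps the population diverse.

```python
import math
import random


def sigmoid(a):
    # Clamp the argument so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, a))))


def forward(w, x, n_in, n_hid):
    """One-hidden-layer MLP; w is the flat weight vector (one BBO habitat)."""
    out_start = n_hid * (n_in + 1)      # index where output-layer weights begin
    z = w[out_start + n_hid]            # output bias
    for h in range(n_hid):
        base = h * (n_in + 1)
        a = w[base + n_in] + sum(w[base + i] * x[i] for i in range(n_in))
        z += w[out_start + h] * sigmoid(a)
    return sigmoid(z)


def mse(w, X, y, n_in, n_hid):
    """Habitat cost: mean squared error on the training set (lower is better)."""
    return sum((forward(w, x, n_in, n_hid) - t) ** 2 for x, t in zip(X, y)) / len(X)


def bbo_train(X, y, n_hid=3, pop_size=20, generations=50,
              p_mut=0.05, n_elite=2, bound=5.0, seed=0):
    rng = random.Random(seed)
    n_in = len(X[0])
    dim = n_hid * (n_in + 1) + n_hid + 1    # all weights plus all biases

    def cost(w):
        return mse(w, X, y, n_in, n_hid)

    pop = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(pop_size)]
    # Assumed linear migration model: rank 0 (best) emigrates most, immigrates least.
    mu = [(pop_size - r) / (pop_size + 1) for r in range(pop_size)]
    lam = [1.0 - m for m in mu]
    history = []
    for _ in range(generations):
        pop.sort(key=cost)                  # best habitat first
        history.append(cost(pop[0]))
        new_pop = [hab[:] for hab in pop]
        for i in range(n_elite, pop_size):  # the elites are copied unchanged
            for d in range(dim):
                if rng.random() < lam[i]:
                    # Migration: roulette-wheel choice of the emigrating habitat.
                    j = rng.choices(range(pop_size), weights=mu)[0]
                    new_pop[i][d] = pop[j][d]
                if rng.random() < p_mut:
                    # Mutation: replace the feature with a fresh random weight.
                    new_pop[i][d] = rng.uniform(-bound, bound)
        pop = new_pop
    pop.sort(key=cost)
    history.append(cost(pop[0]))
    return pop[0], history


# Hypothetical toy data (not one of the paper's datasets): two binary
# features per message, label 1 = spam.
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_toy = [0, 1, 1, 1]
best_w, history = bbo_train(X_toy, y_toy)
print("best training MSE:", history[-1])
```

Because the elite habitats are carried over unchanged, the best training error recorded in `history` can never increase from one generation to the next. A real spam filter would replace the toy data with feature vectors extracted from e-mails and would evaluate the trained network on held-out messages.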

Datasets

Experiments and Results

Conclusion