Abstract
The Internet is open, public, and widely used, which makes it a channel for data transmission and message exchange between criminals, terrorists, and others with illegal motives. It is equally used to exchange important data between military and financial institutions, and between ordinary citizens. Email is one of the most widely used means of exchanging information over the Internet. Email messages are digital evidence, and courts in many countries now rely on them as evidence in convictions. This has prompted researchers to keep developing email analysis tools that use the latest technologies to find digital evidence in email messages and to help forensic experts analyze groups of emails. This work presents a technique for analyzing and classifying emails based on data processing and extraction, trimming and refinement, and clustering, followed by the SWARM algorithm to improve performance and an adapted support vector machine algorithm to classify the emails, with the aim of obtaining practical and accurate results. The framework also proposes a hybrid English lexical dictionary (SentiWordNet 3.0) for email forensic analysis; it contains positive and negative sentiment words (sentiwords) and can be used together with the machine learning algorithm. The proposed system is capable of learning in an environment with large and variable data. The system was tested on the publicly available Enron email dataset (May 7, 2015 version of the dataset). In the best case, an accuracy of 92% was obtained.
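As a rough illustration of the lexicon step mentioned in the abstract, the following Java sketch tokenizes an email body and averages per-word polarity scores. It is not the authors' implementation: the class name `EmailSentimentSketch`, the tiny `TOY_LEXICON`, and its scores are hypothetical stand-ins for SentiWordNet 3.0, and the clustering, SWARM, and support vector machine stages are omitted.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

public class EmailSentimentSketch {
    // Hypothetical toy lexicon: these words and scores are illustrative
    // assumptions, NOT values taken from SentiWordNet 3.0.
    static final Map<String, Double> TOY_LEXICON = new HashMap<>();
    static {
        TOY_LEXICON.put("good", 0.6);
        TOY_LEXICON.put("great", 0.8);
        TOY_LEXICON.put("thanks", 0.5);
        TOY_LEXICON.put("bad", -0.6);
        TOY_LEXICON.put("fraud", -0.9);
        TOY_LEXICON.put("loss", -0.5);
    }

    // Lowercase the text and split on anything that is not a letter a-z.
    static List<String> tokenize(String body) {
        return Arrays.stream(body.toLowerCase(Locale.ROOT).split("[^a-z]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // Average polarity over all tokens; words missing from the lexicon count as 0.
    static double score(String body) {
        List<String> tokens = tokenize(body);
        if (tokens.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (String t : tokens) {
            sum += TOY_LEXICON.getOrDefault(t, 0.0);
        }
        return sum / tokens.size();
    }

    // Map the averaged score to a coarse label.
    static String label(String body) {
        double s = score(body);
        if (s > 0) {
            return "positive";
        }
        if (s < 0) {
            return "negative";
        }
        return "neutral";
    }

    public static void main(String[] args) {
        System.out.println(label("Thanks, great work on the report."));
        System.out.println(label("The fraud led to a bad loss."));
    }
}
```

In a pipeline like the one described, a score of this kind would more plausibly serve as one input feature to the classifier than as the final label.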
Highlights
Email is a very important Internet application for data communication, used by computers and by numerous electronic devices [1]; it is a common way for parties to communicate, and it transfers information between servers on a specified port number
Email messages are digital evidence, and courts in many countries now rely on them as evidence in convictions
The experiments in this work were run in the following environment: Windows 10, Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz 2.29 GHz, 8 GB RAM, and a 64-bit system type. The proposed system is programmed in Java on NetBeans IDE 8.2, with WampServer to handle the MySQL database, and uses SentiWordNet 3.0 and Stanford
Summary
Email is a very important Internet application for data communication, used by computers and by numerous electronic devices [1]; it is a common way for parties to communicate, and it transfers information between servers on a specified port number. Data mining applies algorithms to extract patterns from information and to make useful information available to management, and it has a number of applications in digital cyber forensics. These include discovering forensic information and classifying it into groups based on relationships, identifying relationships in forensic associations, detecting patterns in information that lead to helpful forecasting, and detecting groups of hidden facts [7]. Such machine learning techniques have been widely used to extract evidence from large groups of emails.
More From: Iraqi Journal of Information & Communications Technology