Abstract
Digital forensics is the recovery and investigation of material found on digital devices, chiefly computers. Forensic authorship analysis is a branch of digital forensics that includes tasks such as authorship attribution, authorship verification, and author profiling. In authorship verification, given a set of sample documents D written by an author A and an unknown document d, the task is to determine whether d was written by A. Authorship verification has previously been performed using genetic algorithms, SVM classifiers, and other methods. The existing system builds an ensemble model by combining features based on their similarity scores, with parameters optimized by grid search. The verification accuracy of the grid search method is 62.14%. Its time complexity is high because the system tries all possible combinations of features while constructing the ensemble model. In the proposed work, Modified Particle Swarm Optimization (MPSO) is used instead of the ensemble model to construct the classification model in the training phase. In addition to the combination of linguistic and character features, Average Sentence Length is used to improve verification accuracy. The verification accuracy improves to 63.38%.
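The Average Sentence Length feature named in the abstract can be illustrated with a minimal sketch. The abstract does not say how sentences or words are delimited, so the regular-expression sentence splitter, the `\w+` word tokenizer, and the function name `average_sentence_length` are all assumptions made for illustration, not the authors' implementation.

```python
import re


def average_sentence_length(text: str) -> float:
    """Return the mean number of word tokens per sentence (0.0 for empty text)."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    # Assumption: a word is any maximal run of word characters.
    word_counts = [len(re.findall(r"\w+", s)) for s in sentences]
    return sum(word_counts) / len(sentences)


if __name__ == "__main__":
    sample = "I came. I saw it. I conquered the city today."
    print(average_sentence_length(sample))
```

In a verification setting, a value like this would be computed for both the known documents D and the unknown document d and then compared alongside the other features.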
Highlights
Digital forensics locates evidence stored on computers, mobile phones, and networks [1]
Authorship Verification (AV) is the process of determining whether a given unknown document x was written by the same author who wrote a given set of known documents D
The classification model generated using the Modified Particle Swarm Optimization (MPSO) algorithm has higher accuracy than the ensemble model generated by grid search; the results are shown in Table 1
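To give a sense of how a particle swarm searches a parameter space without enumerating every combination as grid search does, the following is a minimal sketch of standard (unmodified) PSO. The modifications that define MPSO are not described in this text and are not reproduced here. The function name `pso_maximize`, the inertia and acceleration constants, the [0, 1] search bounds, and the toy fitness function are all assumptions; in practice the fitness would be something like verification accuracy on training data.

```python
import random


def pso_maximize(fitness, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, lo=0.0, hi=1.0, seed=0):
    """Standard PSO: return (best_position, best_fitness) within [lo, hi]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its personal best; the swarm shares a global best.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


if __name__ == "__main__":
    # Toy fitness (hypothetical): peak at weights (0.3, 0.7).
    def toy(x):
        return -((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2)

    print(pso_maximize(toy, dim=2))
```

The number of fitness evaluations here is `n_particles * (n_iters + 1)`, independent of how finely the parameter space would need to be gridded.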
Summary
Digital forensics locates evidence stored on computers, mobile phones, and networks [1]. Tasks in digital forensics include Authorship Verification (AV), author clustering, and text classification. Text classification is the task of assigning a document to one or more classes. It can be done either manually or algorithmically. AV is the process of determining whether a given unknown document x was written by the same author who wrote a given set of known documents D. It can be viewed either as a mono-class classification task or as a multi-class classification task. The created model is used to predict the authorship of the unknown documents.
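The mono-class view of AV can be sketched as a similarity test: build a profile from the known documents D, score the unknown document x against it, and accept if the score clears a threshold. The sketch below is an illustration under stated assumptions, not the paper's model: character trigram counts as the only feature, cosine similarity as the score, a fixed threshold of 0.5, and the helper names `char_ngrams`, `cosine`, and `verify` are all hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def verify(known_docs, unknown_doc, threshold=0.5, n=3):
    """Return (same_author, score) for unknown_doc against the known set."""
    # Pool all known documents into a single author profile.
    profile = Counter()
    for doc in known_docs:
        profile.update(char_ngrams(doc, n))
    score = cosine(profile, char_ngrams(unknown_doc, n))
    return score >= threshold, score


if __name__ == "__main__":
    known = ["the cat sat on the mat", "the cat ate the rat"]
    print(verify(known, "the rat sat on the cat"))
```

The threshold is the kind of parameter that a search procedure such as grid search or PSO would tune on training data rather than fix by hand.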