A two level learning model for authorship authentication.

Ahmed Taha, Heba M Khalil, Tarek El-Shishtawy

doi:10.1371/journal.pone.0255661

Abstract

With internet use rising rapidly worldwide, forensic authorship authentication now plays a vital role in identifying unknown authors. This paper presents a two-level learning technique for authorship authentication. Rather than relying on learning alone, the technique is supplied with linguistic knowledge, statistical features, and vocabulary features to improve its efficiency. The linguistic knowledge is represented through lexical analysis features such as part of speech. The two-level classifier is designed to capture the best predictive performance for identifying authorship. The first classifier is based on vocabulary features that capture how frequently each author uses certain words. Its results are fed to the second classifier, which is based on a learning technique and depends on lexical, statistical, and linguistic features. All three feature sets describe the author’s writing style in numerical form. Many new features are proposed in this work for identifying the author’s writing style. Although the proposed methodology is tested on Arabic writings, it is general and can be applied to any language. Depending on the machine learning model used, the experiments show that the trained two-level classifier achieves an accuracy ranging from 94% to 96.16%.
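The two-level arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English corpus, the function names, the two stand-in style features (mean word length and type-token ratio), and the nearest-centroid rule used in place of the paper's machine learning models are all assumptions made for illustration. The only idea taken from the abstract is that the first level scores a text against each author's word-frequency profile and passes those scores, together with other numerical style features, to a second classifier.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Whitespace tokenizer; a real system would use a language-aware one.
    return text.lower().split()

def vocab_profiles(docs):
    """Level 1 training: relative word frequencies per author."""
    counts = defaultdict(Counter)
    for text, author in docs:
        counts[author].update(tokenize(text))
    return {
        author: {w: n / sum(c.values()) for w, n in c.items()}
        for author, c in counts.items()
    }

def level1_scores(text, profiles):
    """Level 1 output: one vocabulary score per author (sorted by name)."""
    tokens = tokenize(text)
    n = max(len(tokens), 1)
    return [sum(profiles[a].get(t, 0.0) for t in tokens) / n
            for a in sorted(profiles)]

def style_features(text):
    """Hypothetical stand-ins for the paper's statistical features."""
    tokens = tokenize(text)
    n = max(len(tokens), 1)
    return [sum(len(t) for t in tokens) / n, len(set(tokens)) / n]

def level2_vector(text, profiles):
    # Level 1 scores are fed forward alongside the style features.
    return level1_scores(text, profiles) + style_features(text)

def train_level2(docs, profiles):
    """Level 2 training: per-author centroid (assumed stand-in classifier)."""
    sums, counts = {}, Counter()
    for text, author in docs:
        v = level2_vector(text, profiles)
        prev = sums.get(author, [0.0] * len(v))
        sums[author] = [a + b for a, b in zip(prev, v)]
        counts[author] += 1
    return {a: [x / counts[a] for x in s] for a, s in sums.items()}

def predict(text, profiles, centroids):
    v = level2_vector(text, profiles)
    return min(centroids, key=lambda a: math.dist(v, centroids[a]))

# Toy corpus (invented for this sketch).
train_docs = [
    ("the sea the sea the waves", "A"),
    ("the sea and the shore", "A"),
    ("markets nevertheless fluctuate considerably", "B"),
    ("investors nevertheless anticipate volatility", "B"),
]
profiles = vocab_profiles(train_docs)
centroids = train_level2(train_docs, profiles)
guess_a = predict("the sea and the waves", profiles, centroids)
guess_b = predict("investors nevertheless fluctuate", profiles, centroids)
```

In this sketch the features are left unscaled, so the mean word length dominates the distance. A practical second level would normalise the features and use a trained model rather than centroids.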

Highlights

Forensic authorship authentication means detecting the principal author of an unknown article [1].
The main idea is that each author has a writing style that differs from that of every other author [2].
Throughout this section, we present a review of the approaches for Arabic authorship authentication, including machine learning-based authorship authentication and various types of stylometric features.

Summary

Introduction

Forensic authorship authentication means detecting the principal author of an unknown article [1]. The main idea is that each author has a writing style that differs from that of every other author [2]. This is because some authors’ uncontrollable behaviours and writing styles have been shown to be successful over time. The instance-based approach extracts the features of writing style from each article of each author separately, which makes it possible to catch any variation in the style of writing. Profile-based methods extract writing features by concatenating all the articles belonging to a specific author into one large file. This method helps to identify the most uncontrolled behaviours and the characteristic features of the author’s writing style. A mixture of both approaches is proposed to improve the performance of the authentication process.
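The difference between the instance-based and profile-based approaches can be shown with a short sketch. The feature set here (mean word length and type-token ratio), the function names, and the sample texts are assumptions chosen for illustration and are not taken from the paper. The point is only where the extraction happens: once per article, or once per author after concatenation.

```python
def features(text):
    """Two toy stylometric features computed from whitespace tokens."""
    tokens = text.lower().split()
    n = max(len(tokens), 1)
    return {
        "mean_word_len": sum(len(t) for t in tokens) / n,
        "type_token_ratio": len(set(tokens)) / n,
    }

def instance_based(articles_by_author):
    # One feature vector per article: keeps within-author variation visible.
    return {author: [features(text) for text in texts]
            for author, texts in articles_by_author.items()}

def profile_based(articles_by_author):
    # One feature vector per author: all articles are concatenated first.
    return {author: features(" ".join(texts))
            for author, texts in articles_by_author.items()}

# Invented sample data for the sketch.
corpus = {"A": ["a bb", "ccc dddd"]}
per_article = instance_based(corpus)
per_author = profile_based(corpus)
```

Here `per_article["A"]` holds two vectors with mean word lengths 1.5 and 3.5, while `per_author["A"]` holds a single vector with mean word length 2.5, which averages away the difference between the two articles.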

Journal: PloS one
Publication Date: Aug 5, 2021
License type: CC BY 4.0

