A syntactic characterization of authorship style surrounding proper names

Ana Lučić, Catherine L Blake

doi:10.1093/llc/fqt033

Abstract

Accurately determining who wrote a manuscript has captivated scholars of literary history for centuries, as the identity of the true author can have important ramifications in religion, law, literary studies, philosophy, and education. A wide array of lexical, character, syntactic, semantic, and application-specific features has been proposed to represent a text so that authorship can be attributed automatically. Although surface-level features have been tested extensively, few studies have systematically explored high-level features, in part because of limitations in the natural language processing techniques required to capture them. However, high-level features, such as sentence structure, are used subconsciously by a writer and thus may be more consistent than surface-level features, such as word choice. In this article, we introduce a new high-level feature based on the local syntactic dependencies that an author uses when referring to a named entity (in our case, a person’s name). A series of experiments on movie reviews reveals how the amount of data in both the training and test sets influences predictive performance. Finally, we measure authorship consistency with respect to this new feature and show how consistency influences predictive performance. These results provide other researchers with a model for evaluating new features and suggest that local syntactic dependencies warrant further investigation.
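As a rough illustration of the kind of feature the abstract describes, the Python sketch below counts the dependency relations attached to person-name tokens and normalises them into a per-author profile. The abstract does not define the feature precisely, so the parse format, the relation labels, the `dep:`/`head:` prefixes, and both function names are assumptions made for this example. The parses are written by hand rather than produced by a parser.

```python
from collections import Counter

# Hypothetical hand-written parses. Each token is
# (text, head_index, dependency_relation, is_person_name),
# with 1-based head indices and 0 marking the root.
SENT_A = [
    ("A", 3, "det", False),
    ("young", 3, "amod", False),
    ("Spielberg", 4, "nsubj", True),
    ("directs", 0, "root", False),
]
SENT_B = [
    ("The", 2, "det", False),
    ("critics", 3, "nsubj", False),
    ("praised", 0, "root", False),
    ("Hanks", 3, "obj", True),
]


def local_dependency_counts(sentence):
    """Count dependency arcs that touch a person-name token.

    An arc is counted under "dep:<rel>" when the name is the dependent
    and under "head:<rel>" when the name is the head.
    """
    counts = Counter()
    for _text, head, deprel, is_person in sentence:
        if is_person:
            counts["dep:" + deprel] += 1
        if head > 0 and sentence[head - 1][3]:
            counts["head:" + deprel] += 1
    return counts


def author_profile(sentences):
    """Relative frequency of each local relation across an author's sentences."""
    total = Counter()
    for sentence in sentences:
        total.update(local_dependency_counts(sentence))
    n = sum(total.values())
    return {rel: c / n for rel, c in total.items()} if n else {}


profile = author_profile([SENT_A, SENT_B])
```

Profiles built this way for different authors could then be passed to any standard classifier. Which classifier and which relation inventory to use are left open here, since the abstract does not specify them.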

Journal: Digital Scholarship in the Humanities
Publication Date: Jun 29, 2013
Citations: 5

Similar Papers

The Colores texts of Haneron and Schut
A.M Coeberg Van Den Braak
Quaerendo | VOL. 23
01 Jan 1992

Instance Based Authorship Attribution for Kannada Text Using Amalgamation of Character and Word N-grams Technique
C P Chandrika ... Jagadish S Kallimani
01 Jan 2021

Arabic Poetry Authorship Attribution using Machine Learning Techniques
Al-Falahi Ahmed ... Ramdani Mohamed
Journal of Computer Science | VOL. 15
01 Jul 2019

Large-Scale LiDAR SLAM with Factor Graph Optimization on High-Level Geometric Features.
Krzysztof Ćwian ... Michał R Nowicki
Sensors (Basel, Switzerland) | VOL. 21
15 May 2021
