Abstract

Authorship studies has, over the last two decades, absorbed a number of quantitative methods only made possible through the use of computers. The New Oxford Shakespeare Authorship Companion presents a number of studies that utilize such methods, including some based on machine learning or “deep learning” models. This paper focuses on the specific application of three such methods in Jack Elliott and Brett Greatley-Hirsch’s “Arden of Faversham and the Print of Many.” It finds that their attribution of the authorship of Arden to William Shakespeare is suspect under all three methods: Delta, Nearest Shrunken Centroid, and Random Forests. The underlying models do not sufficiently justify the attributions, the data provided are insufficiently specific, and the internals of the methods are too opaque to bear up to scrutiny. This article attempts to depict the internal flaws of the methods, with a particular focus on Nearest Shrunken Centroid. These methodological flaws arguably arise in part from a lack of rigor, but also from an impoverished treatment of the available data, which focuses exclusively on comparative word frequencies within and across authors. A number of potentially fruitful directions for authorship studies are suggested that could increase the robustness and accuracy of quantitative methods, while also warning of the potential limits of such methods.
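
To make the family of methods under discussion concrete, the following is a minimal sketch of a Nearest Shrunken Centroid attribution over function-word rates, using scikit-learn’s NearestCentroid with a shrink_threshold. The word list, per-segment rates, author labels, and threshold below are invented placeholders, not Elliott and Greatley-Hirsch’s corpus or settings.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

# Placeholder rates (occurrences per 1,000 words) of ten common function
# words in scene-length segments of securely attributed plays.
# Columns follow the order of `words`; all numbers are invented.
words = ["the", "and", "of", "to", "in", "that", "it", "with", "his", "not"]
X_train = np.array([
    [33.0, 26.1, 17.4, 14.2, 11.0, 9.8, 8.1, 6.9, 6.2, 4.5],  # placeholder Shakespeare segment
    [31.8, 27.0, 16.9, 14.8, 10.4, 9.1, 8.6, 7.2, 5.9, 4.9],  # placeholder Shakespeare segment
    [30.2, 22.5, 19.6, 13.1, 12.3, 8.2, 7.4, 7.8, 6.8, 5.4],  # placeholder Kyd segment
    [29.7, 23.1, 20.2, 12.8, 12.9, 8.6, 7.0, 8.1, 7.1, 5.1],  # placeholder Kyd segment
])
y_train = ["Shakespeare", "Shakespeare", "Kyd", "Kyd"]
X_disputed = np.array([[31.0, 24.6, 18.3, 13.7, 11.6, 8.9, 7.7, 7.5, 6.5, 4.8]])  # placeholder disputed segment

# Shrinking pulls each author's centroid toward the overall centroid;
# words whose per-author deviation falls below the threshold stop
# contributing to the attribution.
clf = NearestCentroid(shrink_threshold=0.5)
clf.fit(X_train, y_train)
print(clf.predict(X_disputed))  # nearest (shrunken) centroid wins the attribution
```

Because shrinkage zeroes out words whose per-author deviation falls below the threshold, the final attribution can rest on a much smaller set of words than the input vocabulary suggests, which is one reason the internals of such a classifier deserve close inspection.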

Highlights

  • Due to exponential increases in computing power and data storage over the last sixty years, quantitative analysis is capable of feats of which we could not even conceive prior to the advent of computer technology

  • Researchers have pointed out how sheer quantity of data can allow for new successes with analyses that previously delivered poor results: speech recognition, spam filtering, and image classification are only three particular domains where refinement of existing methods has yielded monumental improvements primarily due to a massive increase in the amount of data being analyzed.[1]

  • This paper aims to examine these computational methodologies in particular with an eye toward opening up their results


Summary

Introduction

Due to exponential increases in computing power and data storage over the last sixty years, quantitative analysis is capable of feats of which we could not even conceive prior to the advent of computer technology. For two reasons, the infrequency of most of the words under consideration in the Arden attribution and the general lack of accuracy in existing Delta tests, Elliott and Greatley-Hirsch’s Delta results do not bear scrutiny. Elliott and Greatley-Hirsch’s Random Forest tests on function words attribute five and four segments of Arden to Kyd, but this merely underscores how striking it is that Random Forests would produce uniform results in favour of one author.
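
To ground the Delta side of that critique, here is a minimal sketch of Burrows’s Delta assuming invented word rates and only two candidate authors; it illustrates the arithmetic such a test performs, not the article’s or the Companion’s data.

```python
import numpy as np

# Placeholder rates (per 1,000 words) of five common words in each
# candidate author's comparison corpus and in the disputed text.
# Columns follow the order of `words`; all numbers are invented.
words = ["the", "and", "of", "to", "in"]
profiles = {
    "Shakespeare": np.array([33.1, 26.4, 17.2, 14.8, 10.9]),
    "Kyd":         np.array([30.5, 22.7, 19.8, 13.2, 12.4]),
}
disputed = np.array([31.0, 24.0, 18.5, 13.9, 11.8])

# Standardise each word's rate across the comparison corpora, then take
# the mean absolute difference in z-scores (Burrows's Delta).
corpus = np.vstack(list(profiles.values()))
mu, sigma = corpus.mean(axis=0), corpus.std(axis=0)
z_disputed = (disputed - mu) / sigma

for author, profile in profiles.items():
    delta = np.mean(np.abs(z_disputed - (profile - mu) / sigma))
    print(f"{author}: Delta = {delta:.3f}")

# The smallest Delta is read as the closest stylistic match. For infrequent
# words, small differences in raw rates translate into large swings in
# z-scores, which is why tests built on rare words can be unstable.
```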
