Close Readings of Big Data: Triangulating Patterns of Textual Reappearance and Attribution in the Caledonian Mercury, 1820–40

Abstract

This essay demonstrates how the iterative use of close and distant reading with historical newspapers can provide new and complementary evidence of the role of scissors-and-paste journalism, or reprinting, in the spread of news content. Using Gale's nineteenth-century British newspaper collections, this paper suggests how best to read the evidence of duplicated content obtained through text mining and explores the extent to which this level of analysis can distinguish between different editorial or production styles. Delving into a close reading of the Caledonian Mercury between 1820 and 1840, this study then tests hypotheses about word count and publication frequency developed through distant reading and determines its most common editorial structures. The study concludes with an exploration of how to extrapolate conclusions from close readings to support a more nuanced understanding of the results of large-scale textual analyses. Overall, it argues that iterative testing through both big data and close reading methodologies, a so-called middle-scale analysis, provides a better method for understanding the ambiguous and shifting structures of nineteenth-century newspapers as well as the points of connection between them.

Similar Papers
  • Dissertation
  • 10.33540/1046
Talking XTC
  • Jan 21, 2022
  • Berrie Jens Van Der Molen

In this thesis, digital search and visualisation technologies are combined into a single methodological approach, the “leveled approach”, for the structural analysis of public debate in digital print and audiovisual media archives. The thesis consists of four studies into the reputation of drugs in post-war Dutch newspaper and radio debates. As each study contributes to digital method development in Digital Humanities and to the field of drug history, a section describing the digital search and analysis trajectory in the digitised media archive (distant reading) precedes each historical narrative (close reading). The four studies explore how the reputation of amphetamine (in chapter 1) and ecstasy (in chapters 2, 3 and 4) developed in a context of national drug regulation in the Netherlands. In this way, the hypothesis that Dutch drug regulation has been subject to an increasingly strong imperative to regulate in the post-war period is studied in the media domain. The findings of the four studies lead to three main conclusions about the development of the reputation of drugs in a context of discursive dynamics specific to the newspaper and radio debates. First, the discursive formation of drugs developed at a pace that was to some degree independent from developments in drug regulation: public unrest in the newspapers preceded amphetamine regulation, while ecstasy was commonly treated as a soft drug on the radio for many years after being classified as a hard drug. Second, the reputation of these drugs developed in a cross-media landscape in which international issues and local issues also had significant effects.
Third, the discursive formation of ecstasy is best understood as multifaceted and contested, revolving around contrasting discursive strands defined by meaning constellations of 1) descriptions of the substance; 2) commonly connected actors; and 3) settings. In newspaper articles these discursive strands appeared mostly independently from each other, whereas they were most obvious in clashes between disagreeing stakeholders in discussions on the radio. This shows that analysing radio and newspaper archives enables an enriched perspective on historical cross-media debates. I suggest two leads for further structural research of digitised media debates. First, the leveled approach can be used as a structural framework for combining distant and close reading in OCR- and/or ASR metadata-enriched archives. This makes possible cross-media public debate research across print and audiovisual media archives. Second, this thesis’ consistent explication of the search and visualisation trajectory - the explication of the iterative space between distant and close reading - shows how to achieve a level of transparency that fosters improved opportunities for self-reflection and peer review for cross-media public debate analysis based on distant and close reading. Moreover, this practice makes it possible to answer historical research questions using analysis of digital media data archives that face challenges related to (meta)data scarcity, uneven/changing (meta)data availability and continuous technological change.

  • Research Article
  • Citations: 3
  • 10.5204/mcj.2871
#FreeBritney and the Pleasures of Conspiracy
  • Mar 17, 2022
  • M/C Journal
  • Naomi Smith + 1 more


  • Research Article
  • Citations: 1
  • 10.15122/isbn.978-2-8124-2126-6
Lire de près, de loin. Close vs distant reading
  • Jan 1, 2014
  • Maria Hermínia Amado Laurel + 2 more

Since the work of Franco Moretti, 'distant reading' has become a research method in literature departments. But what is at stake in this kind of reading? Are 'close reading' and 'distant reading' really opposites or can both these approaches be employed at the same time?

  • Research Article
  • 10.14232/americana.2025.1.70-83
Escalated Reading
  • Nov 27, 2025
  • AMERICANA E-journal of American Studies in Hungary
  • György Fogarasi

In recent decades, the controversy over distant vs. close reading has revolved around the spatiotemporal question of scaling. Participants in the debate have either advocated distance (or speed) or have insisted on proximity (or slowness). On a meta-critical level, some have even argued for the need for any reading to be able to shift between, and thus to combine, different scales. Very little has been said, however, about the limitations of scaling as such, and the irreducibility of reading to the logic of scales. Starting out from a few intricate formulations by some proponents of close and distant reading, this paper attempts to investigate the potentials and limitations of scaling, first by references to “Stanford” (the university as well as its founder), then by looking into Walter Benjamin’s treatment of film, and finally, though most importantly, by re-reading some passages in Poe’s detective story “The Purloined Letter.” These three points of reference (Stanford, Benjamin, Poe) seem analogous in the way they lay mutual emphasis on both serialization and segmentation, fast and slow motion, or distance and proximity. On a closer (or more distant?) look, however, Poe’s text goes even beyond such a scheme of scaling. It testifies to a logic of detection which surpasses mere zooming-in or zooming-out strategies, and points to a notion of reading that is “escalated” not simply because of its extraordinary range in terms of velocity or distance, but more radically because, although it still binds reading to specific scales, it also has an aspect that remains utterly heterogeneous to any logic of scaling. The paper attempts to highlight this radically “escalated” (out-of-scale) aspect of reading.

  • Research Article
  • 10.1353/gyr.2020.0004
Forum: Canon versus "The Great Unread"
  • Jan 1, 2020
  • Goethe Yearbook
  • Birgit Tautz + 1 more

Forum: Canon versus "The Great Unread" (Birgit Tautz and Patricia Anne Simpson). When we embarked on editing the Goethe Yearbook, we brainstormed ideas about formats for disseminating research that would usefully complement the stellar articles that appear annually. Our interest turned to the forum, a robust format that has fostered lively debate elsewhere (e.g., Eighteenth Century Theory and Interpretation) and has recently been popularized by our colleagues at the German Quarterly. Naturally, we zeroed in on a topic that is still underrepresented in the Yearbook but that has begun to alter the ways in which we approach the study of Goethe and, more broadly, the eighteenth century—within our comparatively small field in North America, as well as in Germany and in adjacent disciplines invested in the period (e.g., comparative literature and comparative cultural studies, genre studies, English, Atlantic studies, and history). We are, of course, speaking of Digital Humanities (DH). In the process of identifying experts in the field, we discovered that a few years ago graduate programs in German (at Yale, the University of Chicago, and Konstanz) had devoted a short course to the topic that inspired the title of our inaugural forum. As we approached potential contributors, we posed a series of questions, intended not to spark direct answers, but to serve as an impulse for reflection: What is the canon? How do we define it and how has it been reenvisioned beyond DH? What is the relationship between "mining" thousands of texts through algorithms and scholarship "merely" based on the interpretation of select literary works? What are the consequences of digitizing primary materials? How do DH methodologies and analytical practices enhance and/or endanger the study of the canon? How does "close reading" versus "distant reading" affect the legacy of canonical authors and their impact on the construction of national literary historiography in the nineteenth century?
What is at stake for the discipline of literary study—for the act of (close) reading—when we ask the question about the canon versus the "great unread"? Nine colleagues who are engaged in the theory and practice of DH scholarship responded to our call. The scope of their work is impressive, providing detailed yet suggestive overviews of DH methodologies, insights into the importance of DH and its ability to recuperate historically marginalized writers, case studies of temporary canonicity, and challenges to canonical approaches to the Goethezeit. In framing the debate, we kept in mind the larger context of German studies, while assuming an uncontested relevance of literature and textual studies, certainly among the readers of the Goethe Yearbook. And while we recognized the pitfalls of posing canonical literature as "read" in opposition to a virtually boundless spectrum of texts that can be analyzed only as data, we hoped to prompt a less polarized discussion about the imagined impact of DH and "computational criticism" on our field. We wanted to create a section that allows scholars—whether they are newcomers or well-versed in DH, interested in or deeply skeptical about data—to glimpse the innovative field's rich opportunities, its first instances of obsolescence, even its evident shortfalls; our goal is to allow our readers to decide for themselves whether to read broadly, which directions to pursue further, or whether to disregard the field completely. We invite continuous engagement with the contributions, not to succumb to a trend, but to continue the dialogue. The following essays impressively show that our aim for open discussions was spot-on.
The contributors not only address ways in which DH can broaden an understanding of our field, but they also identify new challenges that arise; quite a few returned to the original meaning of "the great unread" in Margaret Cohen's formulation, namely the fact that canon formation has always implied a curtailing of tradition (as opposed to the texts produced in any given period). Each contribution reveals, in unique ways, not only that possible definitions of and approaches to DH are about as manifold as its projects and practitioners, but that the field has begun what we may call its own historicization; it now encompasses digital preservation, humanistic inquiry about digital objects (text, image, space, networks...

  • Research Article
  • 10.3389/fdata.2021.723043
Representation of Jews and Anti-Jewish Bias in 19th Century French Public Discourse: Distant and Close Reading
  • Jan 26, 2022
  • Frontiers in Big Data
  • Simon Levis Sullam + 3 more

We explore through the lens of distant reading the evolution of discourse on Jews in France during the nineteenth century. We analyze a large textual corpus including heterogeneous sources—literary works, periodicals, songs, essays, historical narratives—to trace how Jews are associated with different semantic domains, and how such associations shift over time. Our analysis deals with three key aspects of such changes: the overall transformation of embedding spaces, the trajectories of word associations, and the comparative projection of different religious groups over different, historically relevant semantic dimensions or streams of discourse. This allows us to show changes in the association between words and semantic domains (referring e.g. to economic and moral behaviors), the evolution of stereotypes, and the dynamics of bias over a long time span characterized by major historical transformations. We suggest that the analysis of large textual corpora can be fruitfully used in a dialogue with more traditional close reading approaches—by pointing to opportunities for in-depth analyses that mobilize more qualitative approaches and a detailed inspection of the sources that distant reading inevitably tends to aggregate. We offer a short example of such a dialogue between different approaches in our discussion of the Second Empire transformations, where we mobilize the historian’s tools to start disentangling the complex interactions between changes in French society, the nature of sources, and representations of Jews. While our example is limited in scope, we foresee large potential payoffs in the cooperative interaction between distant and close reading.
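The word-association trajectories the abstract describes can be illustrated with a minimal sketch: the association between two words is measured as the cosine similarity of their embedding vectors, and its change across period sub-corpora forms a trajectory. The words and 3-dimensional vectors below are invented purely for illustration; they are not the paper's data or model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented 3-d vectors standing in for embeddings trained on two
# period sub-corpora; real models have hundreds of dimensions.
vectors_1830s = {"jews": [0.9, 0.1, 0.2], "commerce": [0.8, 0.2, 0.1]}
vectors_1880s = {"jews": [0.2, 0.9, 0.1], "commerce": [0.8, 0.2, 0.1]}

# The trajectory of one word association across the two periods:
early = cosine(vectors_1830s["jews"], vectors_1830s["commerce"])
late = cosine(vectors_1880s["jews"], vectors_1880s["commerce"])
print(f"1830s: {early:.2f}  1880s: {late:.2f}")
```

Note that embedding spaces trained separately per period are not directly comparable; in practice they must first be aligned (e.g. by orthogonal Procrustes) before such cross-period trajectories are meaningful.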

  • Research Article
  • Citations: 1
  • 10.1093/llc/fqz012
Patterns in language: Text analysis of government reports on the Irish industrial school system with word embedding
  • Apr 10, 2019
  • Digital Scholarship in the Humanities
  • Susan Leavy + 2 more

Industrial Memories is a digital humanities initiative to supplement close readings of a government report with new distant readings, using text analytics techniques. The Ryan Report (2009), the official report of the Commission to Inquire into Child Abuse (CICA), details the systematic abuse of thousands of children from 1936 to 1999 in residential institutions run by religious orders and funded and overseen by the Irish State. Arguably, the sheer size of the Ryan Report—over 1 million words—warrants a new approach that blends close readings to witness its findings, with distant readings that help surface system-wide findings embedded in the Report. Although CICA has been lauded internationally for its work, many have critiqued the narrative form of the Ryan Report, for obfuscating key findings and providing poor systemic, statistical summaries that are crucial to evaluating the political and cultural context in which the abuse took place (Keenan, 2013, Child Sexual Abuse and the Catholic Church: Gender, Power, and Organizational Culture. Oxford University Press). In this article, we concentrate on describing the distant reading methodology we adopted, using machine learning and text-analytic methods and report on what they surfaced from the Report. The contribution of this work is threefold: (i) it shows how text analytics can be used to surface new patterns, summaries and results that were not apparent via close reading, (ii) it demonstrates how machine learning can be used to annotate text by using word embedding to compile domain-specific semantic lexicons for feature extraction and (iii) it demonstrates how digital humanities methods can be applied to an official state inquiry with social justice impact.
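Contribution (ii), compiling a domain-specific semantic lexicon from word embeddings, can be read as growing a seed lexicon with words whose vectors sit close to the seeds. The sketch below uses invented 3-dimensional vectors purely for illustration; the paper's actual model, vocabulary and thresholds are not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Invented toy embeddings; a word-embedding model trained on the
# Report itself would supply these vectors in a real pipeline.
embeddings = {
    "beaten":   [0.9, 0.1, 0.0],
    "flogged":  [0.8, 0.2, 0.1],
    "punished": [0.7, 0.3, 0.0],
    "school":   [0.1, 0.1, 0.9],
    "kitchen":  [0.0, 0.2, 0.8],
}

def expand_lexicon(seeds, embeddings, threshold=0.9):
    """Grow a seed lexicon with every vocabulary word whose vector is
    close (cosine >= threshold) to some seed word."""
    lexicon = set(seeds)
    for word, vec in embeddings.items():
        if any(cosine(vec, embeddings[s]) >= threshold for s in seeds):
            lexicon.add(word)
    return lexicon

print(sorted(expand_lexicon(["beaten"], embeddings)))
# ['beaten', 'flogged', 'punished']
```

The expanded lexicon then serves as a feature extractor: sentences of the Report containing lexicon words can be annotated automatically instead of by hand.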

  • Research Article
  • 10.6084/m9.figshare.3475418.v1
Boutique Big Data: reintegrating close and distant reading of 19th-Century newspapers
  • Aug 10, 2019
  • Melodee Beals

From their earliest incarnations in the seventeenth century, through their Georgian expansion into provincial and colonial markets and culminating in their late-Victorian transformation into New Journalism, British newspapers have relied upon scissors-and-paste journalism to meet consumer demands for the latest political intelligence and diverting content. Although this practice, wherein one newspaper extracted or wholly duplicated content from another, is well known to scholars of the periodical press, in-depth analysis of the process is hindered by the lack of formal records relating to the reprinting process. Although anecdotes abound, attributions were rarely and inconsistently given and, with no legal requirement to recompense the original author, formal records of where material was obtained were unnecessary. Even if they had existed, the number of titles that relied upon reprinted material makes systematic analysis impossible; for many periodicals, only a few issues, let alone business records, survive. However, mass digitisation of these periodicals, in both photographic and machine-readable form, offers historians a new opportunity to rediscover the mechanics of nineteenth-century reprinting. By undertaking multi-modal and multi-scalar analyses of digitised periodicals, we can begin to reconstruct the precise journeys these texts took from their first appearance to their multiple ends. Before the advent of the telegraph, individual texts were disseminated manually, through postal and private correspondence routes, over sea and land. This allowed for the relatively slow spread of texts across communication networks, as well as their adaptation, truncation and expansion at various stages. In a manner similar to modern internet memes, blogs and online news content, texts underwent evolutionary changes with each reprinting.
These could be minute, such as the correction of spelling errors or the application of house style, or significant, through selective reordering and truncation to alter the overall meaning of the text. While identifying meme families, or collections of related texts, can help us understand what made particular texts popular, or viral, it is only by tracing the specific trajectories and pathways of these texts that the causes and consequences of evolutionary changes can be understood. Doing so requires us to approach these texts on multiple scales. First, by mining extremely large corpora, derived from several independent collections, we are able to identify a statistically sufficient portion of the historical network. Then, by carefully analysing the chronology and discrepancies between these reprints, hypotheses regarding institutional and industry standards can be posited and tested against the wider corpus. These efforts can be further buttressed by utilising manual transcriptions found in the personal archives of researchers using historical newspapers, such as the Scissors and Paste Database (www.scissorsandpaste.net). These transcriptions, far more accurate than the majority of datasets derived from optical character recognition, greatly improve the mining of the corpora, yielding a more complete initial network to analyse, as well as offsetting the skewing effect of the ‘offline penumbra’. This poster will explore the possibilities of large-scale reprint identification within and across digitised collections using a combination of Lou Bloomfield’s Copyfind and project-specific code to identify matches between individual articles or full pages of texts in both manual (perfect) and OCR (messy) transcriptions. Exemplar collections include the British Library's 19th-Century Newspapers digital collection and planned expansions into the digital collections of the National Library of Wales (Welsh Newspapers Online) and of Australia (Trove).
The poster will also demonstrate the means by which reprint branching can be mapped using chronology and character clustering and the relative precision of manual and computer-aided techniques. Finally, it will explore the nature of multi-scalar analysis and how we might best reintegrate ‘boutique’ periodical research, such as the author’s Scissors and Paste Database, into large-scale text-mining projects.
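The Copyfind-style matching the poster describes can be illustrated with a minimal, hypothetical sketch (not the project's actual code): each text is reduced to a set of overlapping word n-grams ("shingles"), and a Jaccard score over those sets flags likely reprints even when the texts diverge at the edges.

```python
def shingles(text, n=5):
    """Return the set of n-word shingles in a text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_score(a, b, n=5):
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

# Invented example texts, not actual Caledonian Mercury copy.
original = ("The Caledonian Mercury reports that the harvest in the "
            "north has been unusually abundant this season.")
reprint = ("We learn that the harvest in the north has been unusually "
           "abundant this season, says a correspondent.")
unrelated = "Parliament rose yesterday after a short and stormy session."

print(overlap_score(original, reprint, n=4))    # high: long shared run of words
print(overlap_score(original, unrelated, n=4))  # zero: no shared 4-grams
```

Real OCR-derived text needs fuzzier matching that tolerates character-level errors, which is where tools such as Copyfind add value over this sketch.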

  • Dissertation
  • 10.14264/uql.2019.63
Becoming Ali: digital history, newspaper discourse, and America’s most famous boxer, 1960–1975
  • Jan 18, 2019
  • Stephen Townsend

Cassius Marcellus Clay Jr. publicly changed his name to Muhammad Ali on 6 March 1964. In doing so, he signalled his allegiance to the Nation of Islam – a controversial religious sect that advocated racial separatism and black nationalism – as well as his intention to defy established cultural expectations for black athletes in the United States. Through his name change, he provoked diverse reactions from the media that changed over time. This study analyses the discursive significance of Ali’s two names – Clay and Ali – as a way to analyse complex and shifting journalistic attitudes toward him between 1960 and 1975. To do so, it employs a mix of digital and traditional methodologies: specifically, distant and close reading. As such, this thesis is part of a growing body of digitally driven scholarship that is re-shaping sport history in the new millennium. The foundation of this study is a distant reading of almost 40,000 articles written about Ali between 1960 and 1975 from 13 newspapers. This group of publications was selected to be geographically and culturally diverse, and includes three major white-run dailies and ten black newspapers from across the United States. Distant reading – a form of quantitative analysis that uses graphical representations to visualise trends and themes within large bodies of literature – indicates that rather than moving gradually toward acceptance of his Muslim name and its associated identity, journalists shifted their attitudes toward Ali at three key junctures. In March 1964, journalists overwhelmingly referred to him as Cassius Clay, not Muhammad Ali. This practice continued until September 1967, when newspapers began to print the two names almost interchangeably.
The final shift occurred in March 1971, when journalists reversed their earlier rejection of the Muslim name completely and began referring to him almost exclusively as Muhammad Ali. Guided by the shifts identified by distant reading, this thesis then moves to a detailed close reading of individual articles with the aim of uncovering the deep, discursive forces that shaped usage of Ali’s two names. An analysis of articles published between March 1964 and September 1967 reveals that although both black and white newspapers comprehensively rejected Ali’s Muslim name, there were important differences in their motivations. The rejection of the name by white newspapers was symptomatic of their broader refusal to engage critically with racial issues during the mid-1960s. By comparison, the black press rejected the name because it signified Ali’s affiliation with the Nation of Islam, whose program of black nationalism and racial separatism threatened to undermine the integration movement. The relatively interchangeable way that newspapers used the two names between September 1967 and March 1971 was influenced predominantly by Ali’s refusal to be drafted into the United States Army. Close reading also reveals a number of deeper discursive factors that prompted journalists to display less animosity toward Ali’s Muslim name. Ali’s punishment at the hands of legal and athletic authorities earned him a measure of sympathy from the press. However, close reading indicates that this changing personal narrative was augmented by broader cultural shifts occurring in the United States throughout this period. The rise of radical black power groups and growing criticism of the Vietnam War made Ali’s activism appear increasingly moderate by comparison, and enhanced his appeal to mainstream audiences.
The influence of these personal and cultural factors culminated in March 1971, after which journalists referred to him almost exclusively as Muhammad Ali. These trends are then examined within a single publication: the Louisville Defender. Examining journalistic narratives from the Defender – Ali’s hometown black newspaper – enables a more granular examination of the factors that shaped press attitudes toward the boxer. By incorporating analysis of Louisville’s unique racial culture as well as the influence of individual personalities at the Defender, this close reading further reveals the diversity of attitudes toward Ali across the United States. Rather than being swayed by hometown parochialism, the Defender energetically critiqued Ali’s racial and religious beliefs and aligned itself with the attitudes of other black publications around the country. Distant and close readings show that American newspapers did not embrace the name Muhammad Ali until March 1971. At the height of his career, he provoked complex and critical reactions from journalists with a diverse range of racial, religious, political, cultural and geographical backgrounds. Modern cultural memories of the late boxer tend to eschew these aspects of Ali’s cultural identity, favouring more benevolent visions of him as a peacemaker or civil rights hero. By analysing shifting attitudes toward Ali between 1960 and 1975, and interrogating the complex discursive factors that drove these shifts, this thesis contributes to a more nuanced historical understanding of his cultural significance.
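The distant-reading layer described above, tracking which of the two names journalists used over time, reduces at its core to frequency counts over a dated corpus. A toy sketch with invented articles (the thesis's own corpus and tooling are not shown here):

```python
from collections import Counter

# Invented toy articles standing in for the ~40,000 dated newspaper
# articles the thesis analyses: (year-month, text) pairs.
articles = [
    ("1964-04", "Cassius Clay defended his title last night in Boston."),
    ("1964-04", "Clay, who now calls himself Muhammad Ali, spoke briefly."),
    ("1971-04", "Muhammad Ali lost a unanimous decision to Joe Frazier."),
    ("1971-04", "Ali vowed to fight Frazier again before the year is out."),
]

def name_counts(articles):
    """Tally mentions of each name per month. Crude substring counting;
    a real study would tokenise and handle edge cases."""
    counts = {}
    for month, text in articles:
        tally = counts.setdefault(month, Counter())
        tally["Clay"] += text.count("Clay")
        tally["Ali"] += text.count("Ali")
    return counts

for month, tally in sorted(name_counts(articles).items()):
    print(month, dict(tally))
```

Plotting these per-month tallies as a time series is what surfaces the three junctures (1964, 1967, 1971) that then guide the close reading.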

  • Research Article
  • Citations: 5
  • 10.33186/1027-3689-2019-10-56-67
Distant reading as a strategy of an exact study of bibliography
  • Oct 5, 2019
  • Scientific and Technical Libraries
  • V P Leonov

In recent times, marked by the rapid and large-scale advance of computer technology, numerous libraries and scientific and educational centers around the world are creating their own extensive databases of literary and bibliographic texts. Faced with such databases, the close reading method, designed to work with specific texts, would seem to lose its meaning. The Italian sociologist and literary critic Franco Moretti became the main critic of close reading. He presented his ideas in the book “Distant Reading”, which can be viewed as a program to update the methodology of studying world literature. Moretti believes that world literature should be studied not by looking at the details, but by examining it from a long distance: studying hundreds and thousands of texts. He suggests using Digital Humanities (DH) methods, i.e. applying digital (computer) methods in the humanities. To show the reasons for the survival of certain types of texts, Moretti compares literary processes with biological ones and draws an analogy between natural selection and reader selection. Moretti’s predecessor, who first used quantitative methods in literary studies and saw common ground between literary and biological processes, was B. I. Yarkho (1889–1942), author of the fundamental monograph “Methodology of an Exact Study of Literature”. Moretti’s book “Distant Reading” shatters stereotypes of the bibliographic environment. It is directed not to the study of close (slow) reading, but to the study of the entire world documentary flow. This approach opens the way to the use of quantitative methods in the study of world bibliography. A new research strategy, the “exact study of bibliography”, will be formed as part of digital and automated text processing.

  • Research Article
  • Citations: 3
  • 10.1353/ems.2011.0006
And—?: Using Digital Tools to Reread The Canterbury Tales
  • Jan 1, 2011
  • Essays in Medieval Studies
  • Patrick J Mcmahon + 1 more

And—? Using Digital Tools to Reread The Canterbury Tales (Patrick J. McMahon and Allen J. Frantzen). Teachers and scholars of medieval literature have long championed close reading. Some teachers lament the influence of literary theories that, once reduced to ideological criticism, seem to replace close reading and to encourage students to impose meaning on texts rather than discover ways in which meaning is created by language. Our aim is to show how newly developed digital technologies promote close reading and help to reinvigorate the study of language as a component of literary meaning. These technologies, explored within the new discipline of Digital Humanities, show that both the text and the classroom can be seen as laboratories for the exploration and discovery of meaning and the processes that shape it. Our example involves Chaucer's most frequently used (if not his favorite) word, and. Digital Humanities (DH) has been defined by Matthew Kirschenbaum as "a field of study, research, teaching, and invention concerned with the intersection of computing and the disciplines of the humanities." DH "involves investigation, analysis, synthesis and presentation of information in electronic form," including the study of "how these media affect the disciplines in which they are used."1 DH embraces some new and expensive kinds of software, but one DH tool, the concordance, has long been a staple of medieval literary study. Many concordances are now available online, and like some other powerful tools, including The Middle English Dictionary, are free.2 Such tools enable what is, in DH, called "distant reading," and distant reading, as we will show, is an important way to assist close reading. Distant reading is a term used to describe the work of Franco Moretti, a scholar more famous for counting novels than reading them.
Critical of "the minimal fraction of the literary field we work on," Moretti and his followers use statistical trends (involving length of novels, for example, and their sales) to illuminate the history of publishing and the history of public taste.3 Their idea is not to broaden the canon of works that are interpreted but rather to count works published and to analyze the distribution of works instead of their content. Our application of distant reading is different. We use the term as a way to approach words in texts rather than books on the shelf, although we too are concerned with distribution and patterns rather than interpretation. We think of distant reading as the use of computational resources to identify language patterns that human readers either overlook or cannot see without the help of machines. A concordance offers a perspective on an author's corpus that would take any reader a long time to create. Users of such tools for vernacular languages need to know many things, especially that orthographical variants have to be accommodated, including i for y spellings, inflections, and vowel changes. Such matters, of course, are not obstacles. Rather, they are important components of learning how medieval languages work. Ordinary DH tools go far beyond the concordance in reassembling texts into new units. They can list sentences according to their length, their use of prepositional phrases, their density as measured by nouns, and countless other criteria. Although this sounds like something new, David L.
Hoover has shown that modern quantitative studies date from the 1850s, when attempts were made to answer questions concerning the attribution of anonymous works by analyzing vocabulary and other aspects of textual composition.4 Medievalists interested in DH and distant reading can see impressive results in the research of Michael Witmore and Jonathan Hope.5 In "The Hundredth Psalm to the Tune of 'Green Sleeves': Digital Approaches to Shakespeare's Language of Genre," the authors address the linguistic makeup of Shakespeare's genres. Using a program called DocuScope, they "offer a portrait of Shakespearean genre at the level of the sentence, showing how an identification of frequently iterated combinations of words (either in their presence or absence) can allow us to appreciate the integrity and fluidity of Shakespeare's genres."6 Witmore and Hope see texts as two different types of objects: first, as historical objects and theatrical performances once acted out by real people on...
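The concordance work described above can be sketched as a key-word-in-context (KWIC) search. The snippet below is a hypothetical illustration, not the authors' tooling, and it sidesteps the orthographic variants (i for y spellings, inflections) that the essay rightly says real Middle English work must accommodate:

```python
import re

def kwic(text, keyword, width=3):
    """Key-word-in-context: return each occurrence of `keyword` with up
    to `width` words of context on either side."""
    words = re.findall(r"[a-zA-Z]+", text.lower())
    hits = []
    for i, word in enumerate(words):
        if word == keyword:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            hits.append(f"{left} [{word}] {right}")
    return hits

# Opening of the General Prologue, used only to exercise the function.
text = ("Whan that Aprille with his shoures soote "
        "The droghte of March hath perced to the roote "
        "And bathed every veyne in swich licour")

for hit in kwic(text, "and"):
    print(hit)  # to the roote [and] bathed every veyne
```

Scaled to a full corpus, the same loop yields exactly the distribution-of-and view the essay uses: every instance of the word alongside the constructions it joins.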

  • Research Article
  • Cited by 7
  • 10.1353/nlh.2015.0023
Statistical Analysis at the Birth of Close Reading
  • Jun 1, 2015
  • New Literary History
  • Yohei Igarashi

What is the actual relation between close reading and non-close methods of textual analysis? Connecting Edward Lee Thorndike’s The Teacher’s Word Book (1921), C. K. Ogden and I. A. Richards’s universal language (Basic English), and Richards’s inaugural theories of close reading, the essay demonstrates that the inception of close reading was shaped by its era’s statistical analyses or “distant reading,” particularly the genre of the word list. The second part of the essay tracks the subsequent divergence of close reading and statistical analysis by considering two exemplary developments: research into the measurement of “readability,” and Cleanth Brooks’s notion of “the heresy of paraphrase.” Ultimately, the essay aims to fine-tune discussions of close and distant reading that have been occasioned by the digital humanities and suggests that literary studies can once again learn from, and contribute to, the field of reading research.

  • Research Article
  • Cited by 1
  • 10.5325/complitstudies.57.4.0585
Introduction: The Interactive Relations Between Science and Technology and Literary Studies
  • Dec 1, 2020
  • Comparative Literature Studies
  • Ning Wang


  • Research Article
  • Cited by 102
  • 10.1111/cgf.12873
Visual Text Analysis in Digital Humanities
  • Jun 20, 2016
  • Computer Graphics Forum
  • S Jänicke + 3 more

In 2005, Franco Moretti introduced Distant Reading to analyse entire literary text collections. This was a rather revolutionary idea compared to traditional Close Reading, which focuses on the thorough interpretation of an individual work. Both reading techniques are the primary means of Visual Text Analysis. We present an overview of the research conducted since 2005 on supporting text analysis tasks with close and distant reading visualizations in the digital humanities. To this end, we classify the observed papers according to a taxonomy of text analysis tasks, categorize applied close and distant reading techniques to support the investigation of these tasks, and illustrate approaches that combine both reading techniques in order to provide a multi-faceted view of the textual data. In addition, we take a look at the text sources used and at the typical data transformation steps required for the proposed visualizations. Finally, we summarize collaboration experiences when developing visualizations for close and distant reading, and we give an outlook on future challenges in that research area.

  • Dissertation
  • 10.25394/pgs.15057330.v1
Trauma in the Syntax: Trauma Writing in David Foster Wallace's Infinite Jest
  • Jul 27, 2021
  • Alyssa Caroline Fernandez

This project presents a case study of postmodern trauma, working at the boundaries of the humanities and computer science to produce an in-depth examination of trauma writing in David Foster Wallace’s novel Infinite Jest. The goal of this project is to examine the intricacies of syntax and language in postmodern trauma writing through an iterative process I refer to as broken reading, which combines traditional humanities methodologies (close reading) and distant, computational methodologies (Natural Language Processing). Broken reading begins with close reading, then ventures into the distant reading processes of sentiment analysis and entity analysis, and then returns again to close reading when the data must be analyzed and the broken computational elements must be corrected. While examining the syntactical structure of traumatic and non-traumatic passages through this broken reading methodology, I found that Wallace represents trauma as gendered. The male characters in the novel, when recollecting past traumata or undergoing traumatic events, maintain their subject status, recognize those around them as subjects, and are able to engage actively with the world around them. On the other hand, the female characters in the novel are depicted as lacking the same capacities for subjectivity and action. Through computational text analysis, it becomes clear that Wallace writes female trauma in a way that reflects their lack of agency and subjectivity while he writes male trauma in a way that maintains their agency and subjectivity. Through close reading, I was able to discover qualitative differences in Wallace’s representations of trauma and form initial observations about syntactical and linguistic patterns; through distant reading, I was able to quantify the differences I uncovered through close reading by conducting part of speech tagging, entity analysis, semantic analysis, and sentiment analysis. 
Distant reading led me to discover elements of the text that I had not noticed previously, despite the occasional flaw in computation. The analyses I produced through this broken reading process grew richer because of failure—when I failed as an interpreter, and when computational analysis failed, these failures gave me further insight into the trauma writing within the novel. Ultimately, there are marked syntactical and linguistic differences between the way that Wallace represents male and female trauma, which points toward the larger question of whether other white male postmodern authors gender trauma in their writings, too. This study has generated a prototype model for the broken reading methodology, which can be used to further examine postmodern trauma writing.
