Abstract
The digital era has given us not only the world of big data but also the tools to deal with this mostly unstructured, mostly textual data: Natural Language Processing (NLP) tools. Yet most humanists and social scientists do not work with big data. They do not deal with millions of documents. Literary critics’ corpora are the handful of works produced by an author. Historians’ primary source documents number in the tens, perhaps hundreds. Social scientists deal with tens or hundreds of transcripts of focus groups and in-depth interviews, or at most a few thousand media articles. They analyze these data either qualitatively or quantitatively with a variety of manual or computer-assisted methodologies, from content analysis to frame analysis, discourse analysis, and quantitative narrative analysis. But, once developed, at least some of the NLP tools of automatic textual analysis, along with the data analytics visualization tools, can be applied not just to big data but to small data as well. This paper illustrates how some of these tools can be used by focusing on a short first-person narrative. The NLP tools reveal patterns of language use that may not be immediately discernible, thus proving useful in the analysis of even small data. But understanding and interpreting these patterns requires knowledge far beyond the NLP tools themselves. Humanists and social scientists need not fear computer scientists; rather, they need to learn to take advantage of them. NLP tools lay a bridge between quality and quantity, with much to be gained from a constant interaction between distant and close reading.