Abstract
In recent years, numerous computational methods have been developed and widely adopted in the humanities and literary studies. Although such methods offer workable solutions to problems inherent in research in these domains, including selectivity, objectivity, and replicability, very little empirical work has applied them to thematic studies in literature. Such studies are still undertaken almost entirely through traditional methods, in which individual researchers read the texts and intuitively abstract generalizations from their reading. This limits the objectivity and replicability of the findings. Furthermore, traditional methods cannot deal effectively with the hundreds of thousands of new novels published every year. To address these problems, this study proposes an integrated computational model for the thematic classification of literary texts based on lexical clustering methods. The corpus comprises Thomas Hardy’s novels and short stories. The study employs computational semantic analysis based on a vector space model (VSM) representation of the lexical content of the texts. The results indicate that the selected texts can be grouped thematically on the basis of their semantic content. This provides evidence that text clustering approaches, which have long been used in computational theory and data mining applications, can be usefully applied in literary studies.
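As a rough illustration of what a VSM representation followed by lexical clustering can look like, the sketch below builds sparse term vectors for a few texts and groups them by similarity. The abstract does not state the weighting scheme, similarity measure, or clustering algorithm used in the study, so TF-IDF weighting, cosine similarity, and average-linkage agglomerative clustering are assumptions made here for illustration only. The four sentences in `docs` are invented placeholders, not passages from Hardy, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document to a sparse TF-IDF vector (dict: term -> weight)."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({t: (f / total) * math.log(n / df[t]) for t, f in c.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, k):
    """Average-linkage agglomerative clustering down to k groups of document indices."""
    groups = [[i] for i in range(len(vectors))]

    def link(a, b):
        # Mean pairwise similarity between the members of two groups.
        return sum(cosine(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))

    while len(groups) > k:
        pairs = ((p, q) for p in range(len(groups)) for q in range(p + 1, len(groups)))
        x, y = max(pairs, key=lambda pq: link(groups[pq[0]], groups[pq[1]]))
        merged = groups.pop(y)  # y > x, so index x is unaffected by the pop
        groups[x].extend(merged)
    return groups


# Invented placeholder texts (assumption): two about rural life, two about marriage.
docs = [
    "the shepherd tended sheep on the farm near the heath",
    "the farmer sold sheep and corn at the market near the farm",
    "the marriage failed and the wife left her husband in sorrow",
    "her husband regretted the marriage and the sorrow it brought",
]

vectors = tfidf_vectors(docs)
groups = cluster(vectors, k=2)
print(groups)
```

In this toy setting, texts that share vocabulary receive higher cosine similarity and are merged first. On a real corpus the choice of preprocessing, term weighting, and number of clusters would each need to be justified, and none of those choices is specified in the abstract.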