Time Period Categorization in Fiction: A Comparative Analysis of Machine Learning Techniques

Fereshta Westin

doi:10.1080/01639374.2024.2315548

Open Access

https://doi.org/10.1080/01639374.2024.2315548

Journal: Cataloging & Classification Quarterly
Publication Date: Feb 7, 2024
License type: CC BY 4.0

#Analysis Of Machine Learning Techniques #Latent Dirichlet Allocation

Abstract

This study investigates the automatic categorization of time period metadata in fiction, a critical but often overlooked aspect of cataloging. Using a comparative analysis approach, the performance of three machine learning techniques, namely Latent Dirichlet Allocation (LDA), Sentence-BERT (SBERT), and Term Frequency-Inverse Document Frequency (TF-IDF), was assessed by examining their precision, recall, F1 scores, and confusion matrix results. LDA identifies underlying topics within the text, TF-IDF measures word importance, and SBERT measures sentence semantic similarity. Based on F1-score analysis and confusion matrix outcomes, TF-IDF and LDA categorized text data by time period effectively, while SBERT performed poorly across all time period categories.
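
As a rough illustration of the evaluation measures named in the abstract, the sketch below derives per-category precision, recall, and F1 from a confusion matrix. The category labels and counts are invented for illustration and are not taken from the study; the function name `per_class_scores` is likewise hypothetical.

```python
def per_class_scores(labels, matrix):
    """Compute precision, recall, and F1 for each category.

    matrix[i][j] is the number of items whose true category is labels[i]
    and whose predicted category is labels[j].
    """
    scores = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        # Column total minus the diagonal: predicted as k, but truly another category.
        fp = sum(matrix[i][k] for i in range(len(labels))) - tp
        # Row total minus the diagonal: truly k, but predicted as another category.
        fn = sum(matrix[k]) - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Hypothetical time period categories and counts (assumptions, not study data).
labels = ["medieval", "19th century", "contemporary"]
matrix = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]

scores = per_class_scores(labels, matrix)
for label, s in scores.items():
    print(f"{label}: P={s['precision']:.2f} R={s['recall']:.2f} F1={s['f1']:.2f}")
```

Reporting the scores per category, rather than as a single average, makes it visible when a technique performs poorly on every time period instead of on only a few.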

Full Text