Abstract

Automated coherence metrics constitute an efficient and popular way to evaluate topic models. Previous works present a mixed picture of their presumed correlation with human judgment. This work proposes a novel sampling approach to mine topic representations at a large scale while seeking to mitigate sampling bias, enabling the investigation of widely used automated coherence metrics via large corpora. Additionally, this article proposes a novel user study design, an amalgamation of different proxy tasks, to derive finer insight into human decision-making processes. This design subsumes the purpose of simple rating and outlier-detection user studies. Like the sampling approach, the user study is extensive, comprising forty participants split into eight study groups, each tasked with evaluating its own set of one hundred topic representations. Usually, when substantiating the use of these metrics, human responses are treated as the gold standard. This article further investigates the reliability of human judgment by flipping the comparison and conducting a novel extended analysis of human responses at the group and individual levels against a generic corpus. The investigation shows a moderate to good correlation between these metrics and human judgment, especially for generic corpora, and derives further insights into the human perception of coherence. Analysing inter-metric correlations across corpora likewise shows moderate to good agreement among the metrics. As these metrics depend on corpus statistics, this article further investigates the topical differences between corpora, revealing nuances in how these metrics should be applied.
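For context, the sketch below illustrates how a corpus-based coherence metric in the NPMI family can be computed for a topic's top words from document co-occurrence statistics. It is a minimal, illustrative implementation only: the function name, the toy corpus, and the boundary conventions (NPMI of -1 for never co-occurring pairs, 1 for always co-occurring pairs) are assumptions for this sketch, not details taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations

def npmi_coherence(topic_words, documents):
    """Average NPMI over all word pairs in a topic's top words,
    using document-level co-occurrence counts from a reference corpus."""
    n_docs = len(documents)
    doc_sets = [set(doc) for doc in documents]
    # Document frequency of each topic word.
    df = Counter({w: sum(1 for d in doc_sets if w in d) for w in topic_words})
    scores = []
    for w1, w2 in combinations(topic_words, 2):
        joint = sum(1 for d in doc_sets if w1 in d and w2 in d)
        if joint == 0:
            scores.append(-1.0)   # convention: no co-occurrence -> minimum NPMI
            continue
        if joint == n_docs:
            scores.append(1.0)    # convention: words always co-occur -> maximum NPMI
            continue
        p1, p2, p12 = df[w1] / n_docs, df[w2] / n_docs, joint / n_docs
        pmi = math.log(p12 / (p1 * p2))
        scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)

# Toy example: score a topic's top words against a tiny reference corpus.
corpus = [["dog", "cat", "pet"], ["dog", "bone", "park"], ["cat", "milk", "pet"]]
print(npmi_coherence(["dog", "cat", "pet"], corpus))
```

Because the score is driven entirely by co-occurrence statistics of the reference corpus, the same topic can receive different coherence values under different corpora, which is the dependence the paper's cross-corpus analysis examines.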
