Abstract

Online news article recommendations are typically of the ‘more like this’ type, generated by similarity functions. Across three studies, we examined the representativeness of different similarity functions for news item retrieval by comparing them to human judgments of similarity. In Study 1 ($N=401$), participants assessed the overall similarity of ten randomly paired news articles on politics, and we compared their judgments to different feature-specific similarity functions (e.g., based on body text or images). In Study 2, we checked for domain differences in a mixed-methods survey ($N=45$), surfacing evidence that the effectiveness of similarity functions differs across news categories (‘Recent Events’, ‘Sport’). In Study 3 ($N=173$), we improved the design of Study 1 by controlling for how news articles were matched, differentiating between dissimilar news articles and articles that were matched on a shared topic, named entities, and/or date of publication, across the ‘Recent Events’ and ‘Sport’ categories. Across all studies, we found that users mostly relied on text-based features (e.g., body text, title) for their similarity judgments, and that BodyText:TF-IDF was the most representative of their judgments. Moreover, the strength of similarity judgments by humans and of similarity scores by feature-specific functions was strongly affected by how news article pairs were matched. We show that humans and similarity functions are better aligned when two news articles are more alike, such as in a news recommendation scenario.
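To make the idea of a feature-specific similarity function concrete, the following is a minimal sketch of a body-text TF-IDF similarity, assuming cosine similarity over smoothed TF-IDF weights. The tokenizer, the weighting formula, the function names, and the three toy article texts are all illustrative assumptions and are not taken from the studies; in a setup like the one described, such scores would then be compared against human similarity ratings for the same article pairs.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (assumed variant).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy article bodies, invented for illustration only.
article_a = "The prime minister announced a new election date after the parliament vote."
article_b = "Parliament vote forces the prime minister to call an early election."
article_c = "The striker scored twice as the home team won the football final."

vecs = tfidf_vectors([article_a, article_b, article_c])
sim_ab = cosine(vecs[0], vecs[1])  # two politics articles
sim_ac = cosine(vecs[0], vecs[2])  # politics vs. sport

print(f"politics vs. politics: {sim_ab:.3f}")
print(f"politics vs. sport:    {sim_ac:.3f}")
```

In this toy setting the two politics articles share several content terms and therefore score higher than the politics and sport pair, which overlap only on a function word.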
