Abstract

Many people watch prerecorded TV programs as daily leisure. Viewers of soap operas or sports may not want to learn the results before watching; however, they may check tweets on devices such as smartphones, and some of those tweets can inadvertently contain spoilers. To help viewers avoid such content, several approaches have been proposed to detect spoilers in both long and short texts, including tweets. In the study by Jeon et al., which focused on detecting spoilers in tweets, only one person labeled the tweets, and the labeled tweets were used to train the detectors. The trained detector was therefore tuned to one person and could be unsuitable for others. A tweet published in the middle of a baseball game may be considered a spoiler by some people but not by others; therefore, a personalized detection method is preferable. However, to the best of our knowledge, none of the related studies has considered such a personalized approach. To address this problem, we propose a semi-supervised approach to detecting spoilers in tweets using a Support Vector Machine (SVM), in which each user labels a set of tweets. The SVM then labels the remaining unlabeled tweets through bootstrapping. To verify the suitability of the proposed approach for personalizing detectors, we conducted an experiment in which two participants were asked to label tweets. The experimental results, evaluated with the Mann-Whitney U test, indicate that this approach is efficient for personalized detection.
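The bootstrapping idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words features, the hand-rolled linear SVM (sub-gradient descent on the hinge loss), the confidence threshold, the function names (`featurize`, `train_linear_svm`, `self_train`, `predict`), and the example tweets are all assumptions made for illustration. One user's labeled tweets train an SVM, the SVM pseudo-labels the unlabeled tweets it is confident about, and the model is retrained on the enlarged set.

```python
# Illustrative sketch only: data, features, threshold, and hyperparameters
# are assumptions, not taken from the paper.

def featurize(text, vocab):
    """Binary bag-of-words vector over a fixed vocabulary."""
    words = set(text.lower().split())
    return [1.0 if w in words else 0.0 for w in vocab]

def score(model, x):
    """Signed decision value w.x + b of a linear SVM."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Sub-gradient descent on the L2-regularized hinge loss; labels are +1/-1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score((w, b), xi) < 1.0
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularization shrink
            if violated:  # hinge-loss step for margin violations
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def self_train(labeled, unlabeled, vocab, threshold=1.0, max_rounds=5):
    """Bootstrapping: pseudo-label confident tweets, then retrain."""
    texts = [t for t, _ in labeled]
    labels = [l for _, l in labeled]
    pool, pseudo = list(unlabeled), []
    model = train_linear_svm([featurize(t, vocab) for t in texts], labels)
    for _ in range(max_rounds):
        scored = [(t, score(model, featurize(t, vocab))) for t in pool]
        confident = [(t, s) for t, s in scored if abs(s) >= threshold]
        if not confident:
            break  # nothing left that the model is confident about
        for t, s in confident:
            label = 1 if s > 0 else -1
            texts.append(t)
            labels.append(label)
            pseudo.append((t, label))
        pool = [t for t, s in scored if abs(s) < threshold]
        model = train_linear_svm([featurize(t, vocab) for t in texts], labels)
    return model, pseudo

def predict(model, text, vocab):
    """+1 = spoiler for this user, -1 = not a spoiler."""
    return 1 if score(model, featurize(text, vocab)) > 0 else -1

# Hypothetical labels from ONE user (+1 = spoiler, -1 = not a spoiler).
labeled = [
    ("home run wins", 1),
    ("final score announced", 1),
    ("watching tonight", -1),
    ("love stadium food", -1),
]
unlabeled = ["walk off home run", "stadium food is great"]
vocab = sorted({w for t in [t for t, _ in labeled] + unlabeled
                for w in t.lower().split()})

model, pseudo = self_train(labeled, unlabeled, vocab)
print("pseudo-labeled tweets:", pseudo)
print("prediction:", predict(model, "home run wins again", vocab))
```

Because every user supplies their own `labeled` list, running `self_train` once per user yields one detector per user, which is the personalization the abstract argues for. In practice an established SVM implementation and richer text features would likely replace the toy components used here.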
