Abstract
The main goal of this work is to provide the cognitive computing community with resources for analyzing and simulating the intentionality and/or emotions embedded in the language used in social media. Specifically, it focuses on the Spanish language and online dialogues, leading to the creation of Sofoco (Spanish Online Forums Corpus). It is the first Spanish corpus consisting of dialogic debates extracted from social media, and it is annotated by means of crowdsourcing to support automatic analysis of subjective language forms such as sarcasm and nastiness. The annotators were also asked about the need for context when making each decision. In this way, users' intentions and behavior in social networks can be better understood and more accurate text analysis becomes possible. The annotation results are analyzed and the reliability of the annotations is explored. Sarcasm and nastiness detection results (an F-measure of around 0.76 in both cases) are also reported. The results indicate that the corpus is a valuable resource that could support a wide range of future work.