Abstract
We show that the frequency of word use is determined not only by word length \cite{Zipf1935} and average information content \cite{Piantadosi2011}, but also by emotional content. We have analyzed three established lexica of affective word usage in English, German, and Spanish, to verify that these lexica have a neutral, unbiased emotional content. Taking into account the frequency of word usage, we find that words with a positive emotional content are more frequently used. This lends support to the Pollyanna hypothesis \cite{Boucher1969} that there should be a positive bias in human expression. We also find that negative words contain more information than positive words, as the informativeness of a word increases uniformly as its valence decreases. Our findings support earlier conjectures about (i) the relation between word frequency and information content, and (ii) the impact of positive emotions on communication and social links.
Highlights
One would argue that human languages, in order to facilitate social relations, should be biased towards positive emotions.
This question becomes relevant for sentiment classification, as many tools assume as a null hypothesis that human expression has neutral emotional content, or reweight positive and negative emotions without quantifying the positive bias of emotional expression.
Considering the everyday usage frequency of these words, we find that the overall emotion of the three languages is strongly biased towards positive values, because words associated with a positive emotion are more frequently used than those associated with a negative emotion.
Summary
One would argue that human languages, in order to facilitate social relations, should be biased towards positive emotions. Our findings support earlier conjectures about (i) the relation between word frequency and information content, and (ii) the impact of positive emotions on communication and social links. Our work focuses on one particular aspect of meaning, namely the emotion expressed in a word, and how this is related to word frequency and information content.