Abstract

In the present study, we provide a comprehensive analysis and a multi-dimensional dataset of semantic transparency measures for 1810 German compound words. Compound words are considered semantically transparent when the contribution of the constituents' meaning to the compound meaning is clear (as in airport), but the degree of semantic transparency varies between compounds (compare strawberry or sandman). Our dataset includes both compositional and relatedness-based semantic transparency measures, also differentiated by constituents. The measures are obtained from a computational and fully implemented semantic model based on distributional semantics. We validate the measures using data from four behavioral experiments: Explicit transparency ratings, two different lexical decision tasks using different nonwords, and an eye-tracking study. We demonstrate that different semantic effects emerge in different behavioral tasks, which can only be captured using a multi-dimensional approach to semantic transparency. We further provide the semantic transparency measures derived from the model for a dataset of 40,475 additional German compounds, as well as for 2061 novel German compounds.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call