Abstract

We review recent evidence indicating that researchers in experimental psychology may have used suboptimal estimates of word frequency. Word frequency measures should be based on a corpus of at least 20 million words that contains language to which participants in psychology experiments are likely to have been exposed. In addition, the quality of word frequency measures should be ascertained by correlating them with behavioral word processing data. When we apply these criteria to the word frequency measures available for German, we find that the commonly used Celex frequencies are the weakest predictors of lexical decision times. Better results are obtained with the Leipzig frequencies, the dlexDB frequencies, and the Google Books 2000-2009 frequencies. However, as in other languages, the best performance is observed with subtitle-based word frequencies. The SUBTLEX-DE word frequencies collected for the present manuscript are made available in easy-to-use files and are free for educational purposes.
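The validation criterion described above, correlating each frequency measure with behavioral word processing data, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the names `measure_a` and `measure_b`, the helper functions, the log10(count + 1) transform, and the numbers, which are invented and are not data from the study. The sketch only shows the general idea of ranking frequency measures by the share of variance in lexical decision times they account for.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def variance_explained(freq_counts, reaction_times):
    """Squared correlation between log10(count + 1) and reaction times.

    The log transform with a +1 offset is an assumption of this sketch,
    chosen so that zero counts remain defined.
    """
    log_freq = [math.log10(c + 1) for c in freq_counts]
    r = pearson_r(log_freq, reaction_times)
    return r * r


# Invented lexical decision times (ms) for five hypothetical words.
reaction_times = [520.0, 560.0, 610.0, 650.0, 700.0]

# Invented raw counts for the same five words in two hypothetical corpora.
toy_measures = {
    "measure_a": [9000, 1200, 300, 40, 5],
    "measure_b": [500, 8000, 20, 900, 60],
}

# Score each measure, then rank from most to least variance explained.
scores = {
    name: variance_explained(counts, reaction_times)
    for name, counts in toy_measures.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

In this toy setup the measure whose log counts fall steadily as reaction times rise ranks first. A real comparison would use many thousands of words and would typically control for other predictors such as word length.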
