Abstract

Word frequency influences both the production and perception of speech. Speakers place greater stress on infrequent words, and poor listening conditions cause listeners to mistake rare words for common ones, but not vice versa. People can also directly estimate relative word frequencies to within one order of magnitude of their objective values. To test whether word frequency effects arise at least in part from phonotactics (rather than from semantics), subjects were asked to estimate word frequencies for a list containing both real and nonsense words. Subjects replicated previous results in their ability to judge the frequency of real words, and showed significant agreement in their judgments of “frequencies” for nonsense words. Subjects' frequency judgments for nonsense words correlated significantly with Greenberg and Jenkins' measure of distance from English, implying that word frequencies are judged by estimating the density of similar-sounding words in the lexicon. These results suggest additions to lexical distance metrics that would improve the performance of speech recognition systems: small-vocabulary systems could tabulate word frequency during system usage and bias recognition towards more frequently used words, while large-vocabulary systems could use lexical density to modulate calculated distances between candidate words and the input utterance.
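The small-vocabulary suggestion can be illustrated with a minimal Python sketch. It is not the authors' method: the class name `FrequencyBiasedRecognizer`, the `bias_weight` parameter, the add-one smoothing, and the log-frequency penalty are all assumptions chosen for illustration, and the acoustic distances are taken as given inputs rather than computed from audio.

```python
import math
from collections import Counter


class FrequencyBiasedRecognizer:
    """Hypothetical small-vocabulary recognizer that biases toward frequent words."""

    def __init__(self, vocabulary, bias_weight=1.0):
        self.vocabulary = list(vocabulary)
        self.bias_weight = bias_weight  # assumed tuning knob, not from the source
        self.counts = Counter()         # usage tallied while the system runs

    def observe(self, word):
        """Record one confirmed use of a vocabulary word."""
        self.counts[word] += 1

    def log_prior(self, word):
        """Add-one smoothed log relative frequency of a word."""
        total = sum(self.counts.values()) + len(self.vocabulary)
        return math.log((self.counts[word] + 1) / total)

    def recognize(self, distances):
        """Pick the word with the smallest frequency-adjusted distance.

        `distances` maps each vocabulary word to its acoustic distance
        from the input utterance (smaller means a closer match).
        """
        def score(word):
            # Rare words have a very negative log prior, which raises their score.
            return distances[word] - self.bias_weight * self.log_prior(word)

        return min(self.vocabulary, key=score)
```

With no usage history, the acoustically closer word wins; after one word has been observed many times, a slightly worse acoustic match to that word can still be selected. How strongly frequency should outweigh acoustic evidence is governed here by `bias_weight`, which would need to be tuned empirically.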
