Objective. Speech comprehension involves detecting words and interpreting their meaning according to the preceding semantic context. This process is thought to be underpinned by a predictive neural system that uses that context to anticipate upcoming words. However, previous studies relied on evaluation metrics designed for continuous univariate sound features, overlooking the discrete and sparse nature of word-level features. This mismatch has limited effect sizes and hampered progress in understanding lexical prediction mechanisms in ecologically valid experiments.

Approach. We investigate these limitations by analyzing both simulated and actual electroencephalography (EEG) signals recorded during a speech comprehension task. We then introduce two novel assessment metrics tailored to capture the neural encoding of lexical surprise, improving upon traditional evaluation approaches.

Main results. The proposed metrics yielded effect sizes over 140% larger than those achieved with the conventional temporal response function (TRF) evaluation. These improvements were consistent across both simulated and real EEG datasets.

Significance. Our findings substantially advance methods for evaluating lexical prediction in neural data, enabling more precise measurements and deeper insights into how the brain builds predictive representations during speech comprehension. These contributions open new avenues for research into predictive coding mechanisms in naturalistic language processing.