How do listeners integrate temporally distributed phonemic information into coherent representations of syllables and words? For example, increasing the silent interval between the words "gray chip" may result in the percept "great chip," whereas increasing the duration of fricative noise in "chip" may alter the percept to "great ship" (B. H. Repp, A. M. Liberman, T. Eccardt, & D. Pesetsky, 1978). The ARTWORD neural model quantitatively simulates such context-sensitive speech data. In ARTWORD, sequentially stored phonemic items in working memory provide bottom-up input to unitized list chunks that group together sequences of items of variable length. The list chunks compete with each other. The winning groupings feed back to establish a resonance that temporarily boosts the activation levels of selected items and chunks, thereby creating an emergent conscious percept whose properties match such data.
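The item-to-chunk-to-item loop described above can be illustrated with a deliberately small toy sketch. It is not the ARTWORD equations or parameters: the item labels, the three chunks, the 1/sqrt(size) input scaling, the quadratic feedback signal, and all numeric constants are assumptions chosen only to show the idea. The sketch makes one bottom-up pass, lets the chunks compete through shunting on-center off-surround dynamics, and then makes one top-down pass, rather than simulating a full resonance.

```python
import math

# Hypothetical items and list chunks (illustrative only, not from the model).
ITEMS = ["A", "B", "C"]
CHUNKS = {"A": ["A"], "AB": ["A", "B"], "ABC": ["A", "B", "C"]}


def bottom_up_input(item_activity):
    """Input to each chunk: summed activity of its items / sqrt(chunk size).

    The scaling is an assumption; it lets a longer chunk receive the largest
    input only when all of its items are active.
    """
    act = dict(zip(ITEMS, item_activity))
    return {
        name: sum(act[i] for i in members) / math.sqrt(len(members))
        for name, members in CHUNKS.items()
    }


def compete(inputs, gain=2.0, dt=0.01, steps=2000):
    """Euler-integrate a shunting competition among the chunks."""
    y = {name: 0.0 for name in inputs}
    for _ in range(steps):
        # Assumed quadratic feedback signal from each chunk.
        f = {name: gain * y[name] ** 2 for name in y}
        total = sum(f.values())
        y = {
            name: y[name] + dt * (
                -y[name]                                      # passive decay
                + (1.0 - y[name]) * (inputs[name] + f[name])  # excitation
                - y[name] * (total - f[name])                 # inhibition
            )
            for name in y
        }
    return y


def top_down_boost(item_activity, chunk_activity, feedback=0.5):
    """Add graded feedback from every chunk to the items it groups."""
    boosted = dict(zip(ITEMS, map(float, item_activity)))
    for name, members in CHUNKS.items():
        for i in members:
            boosted[i] += feedback * chunk_activity[name]
    return boosted


def winning_chunk(item_activity):
    """Name of the most active chunk after the competition."""
    y = compete(bottom_up_input(item_activity))
    return max(y, key=y.get)
```

Under these assumptions, `winning_chunk([1.0, 1.0, 0.0])` selects `"AB"` and `winning_chunk([1.0, 1.0, 1.0])` selects `"ABC"`, so changing which items are active in working memory changes the winning grouping. `top_down_boost` then raises the activity of the items grouped by the more active chunks.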