Objective. To investigate how the auditory system processes natural speech, models have been created that relate the electroencephalography (EEG) signal of a person listening to speech to various representations of that speech. The speech envelope has been used most often, although phonetic representations have also been used. We investigated the degree of granularity at which phonetic representations can be related to the EEG signal. Approach. We used EEG signals recorded from 105 subjects while they listened to fairy tale stories. We used speech representations including onsets of any phone, vowel–consonant onsets, broad phonetic class (BPC) onsets, and narrow phonetic class onsets, and related them to EEG using forward modeling and match–mismatch tasks. In forward modeling, we used a linear model to predict EEG from speech representations. In the match–mismatch task, we trained a long short-term memory (LSTM)-based model to determine which of two candidate speech segments matches a given EEG segment. Main results. Vowel–consonant onsets outperform onsets of any phone in both tasks, which suggests that the EEG tracks the vowel–consonant distinction to some degree. We also observed that vowel (syllable nucleus) onsets exhibit a more consistent representation in EEG than syllable onsets. Significance. Our findings suggest that neural tracking previously thought to be associated with BPCs might actually originate from vowel–consonant onsets rather than from the differentiation between phonetic classes.
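
As a rough illustration of the forward-modeling idea (a linear model that predicts EEG from speech representations), the sketch below fits a time-lagged linear model to a two-feature onset representation and scores it by the per-channel correlation between predicted and actual EEG. It is not the authors' implementation: the use of ridge regularization, the number of lags, the sampling layout, the synthetic data, and all function names (`lag_matrix`, `fit_forward_model`, `predict_eeg`, `channel_correlations`) are assumptions made only for this example.

```python
import numpy as np

def lag_matrix(stim, n_lags):
    """Stack time-lagged copies of stim (n_samples, n_features) side by side."""
    n_samples, n_features = stim.shape
    lagged = np.zeros((n_samples, n_features * n_lags))
    for lag in range(n_lags):
        cols = slice(lag * n_features, (lag + 1) * n_features)
        # Row t of this column block holds the stimulus at time t - lag.
        lagged[lag:, cols] = stim[:n_samples - lag]
    return lagged

def fit_forward_model(stim, eeg, n_lags, ridge=1.0):
    """Ridge-regularized least squares from lagged stimulus to EEG (assumed setup)."""
    X = lag_matrix(stim, n_lags)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ eeg)

def predict_eeg(stim, weights, n_lags):
    """Predict EEG (n_samples, n_channels) from the stimulus representation."""
    return lag_matrix(stim, n_lags) @ weights

def channel_correlations(eeg, eeg_pred):
    """Pearson correlation between actual and predicted EEG, one value per channel."""
    eeg_c = eeg - eeg.mean(axis=0)
    pred_c = eeg_pred - eeg_pred.mean(axis=0)
    num = (eeg_c * pred_c).sum(axis=0)
    den = np.sqrt((eeg_c ** 2).sum(axis=0) * (pred_c ** 2).sum(axis=0))
    return num / den

# Synthetic stand-in data (hypothetical): two binary onset features,
# e.g. vowel onsets and consonant onsets, and four simulated EEG channels.
rng = np.random.default_rng(0)
n_samples, n_lags, n_channels = 4000, 16, 4
stim = (rng.random((n_samples, 2)) < 0.05).astype(float)
true_weights = rng.standard_normal((2 * n_lags, n_channels))
eeg = lag_matrix(stim, n_lags) @ true_weights
eeg += 0.5 * rng.standard_normal((n_samples, n_channels))

# Fit on the first half, evaluate on the held-out second half.
half = n_samples // 2
weights = fit_forward_model(stim[:half], eeg[:half], n_lags)
r = channel_correlations(eeg[half:], predict_eeg(stim[half:], weights, n_lags))
print("held-out correlation per channel:", np.round(r, 3))
```

Under this setup, comparing representations amounts to swapping the columns of `stim` (for example, a single any-phone onset column versus separate vowel and consonant columns) and comparing the held-out correlations.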