Facing linguistic barriers to communication, talkers spontaneously adjust their speech to facilitate perception by their listener. These adjustments characteristically involve hyperarticulation, such as reduced speech rate, increased vowel dispersion, and increased segmental duration. Listener-based hyperarticulation could result from listener modeling, in which the talker monitors their listener’s comprehension. It could also result from rich phonological representations, in which listener-based variation is directly represented. This study examined predictions of the monitoring and representational accounts of listener-based hyperarticulation using a perception task. Participants completed a forced-choice identification task in which they identified the background of an interlocutor from recorded speech. That speech was originally produced to real and imagined, and to native and non-native, interlocutors. If listener-based variation originates in listener modeling, accuracy should be high and similar across listener backgrounds. If it originates in rich representations, accuracy should be low, variable, and different across listener backgrounds. The results revealed poor accuracy, even for speech directed toward imagined non-natives, the most hyperarticulated speech. Participants’ choices were influenced most by speech rate, then by vowel dispersion, but not by vowel duration. We discuss the implications of these results for cognitive models of listener-based phonetic variation.