An auspicious but unexplored environment for studying phonetic variation in naturalistic interaction is one in which two or more participants say the same thing at the same time. Working with a core dataset built from the multimodal Augmented Multi-party Interaction corpus, this study follows the principles of conversation analysis to analyze the sequential organization of the talk and to explain the phonetic variation observed. Both acoustic divergence and acoustic equivalence between simultaneous responses are described. The phonetic features discussed include duration and timing, pitch, loudness, and phonation type. The interactional factors that explain the acoustic divergences are established through turn-by-turn analysis and consideration of gaze direction and other visible features. It is argued that any research on phonetic variation in naturalistic talk that disregards the local organization of interaction will always be incomplete.