Different voices may have persuasive effects on individuals’ decision-making processes; however, in the tourism context, little attention has been paid to paid online tour guide audio. This study investigates how the voice characteristics of tour guide audio play a persuasive role in tourist purchase decisions. Drawing on the stereotype content model, we identify two voice characteristics: perceived warmth and perceived competence. We then extract them from tour guide audio using speech processing and deep learning techniques. Our results show that perceived warmth and competence are positively related to tourist purchase decisions. The contingency effects further indicate greater warmth perception for female tour guides and greater competence perception for male tour guides. In addition, drawing on the value co-creation paradigm, our results suggest that perceived warmth is more salient for nonprofessionals, such as scholars, cultural celebrities, and even tourists themselves, whereas perceived competence is more salient for professionals, such as tour guides. This study represents pioneering work in AI-based sonic analysis in the tourism context and offers practical implications for tour guides on how to design their online tour guide audio to enhance tourist purchase decisions.