Research on virtual-reality (VR) exposure therapy (VRET) in anxiety disorders has primarily focused on effectiveness and acceptability, whereas the underlying working mechanisms have received scant attention. To fill this knowledge gap, we discuss potential theoretical underpinnings of VRET based on three dominant theoretical accounts of exposure: inhibitory-learning theory (expectancy violation), emotional-processing theory (habituation), and self-efficacy theory. Theoretically, habituation and self-efficacy seem plausible candidate mechanisms to explain the effects of VRET, whereas the role of expectancy violation is less straightforward. Because of the simulated nature of VR, some feared outcomes cannot occur, and therefore possibilities to violate expectancies about their occurrence may be compromised. Empirical evidence on the working mechanisms of VRET is scarce and has important limitations. We outline avenues for future research. Insights into the mechanisms of VRET are not only of theoretical importance but can also provide theory-based directions to optimize the application of VRET.