Abstract

English academic presentation (EAP) is an indispensable academic communication skill for university students. With the rapid development of desktop virtual reality (DVR), its application in language learning is worth exploring. The present study examined whether students' EAP performance improves, and whether it differs, when they learn in DVR with an in‐video pedagogical agent (PA) or an out‐of‐video PA. Adopting a between‐subject experimental design, a total of 64 students were randomly assigned to one of two conditions depending on whether the PA was inside or outside the lecture video embedded in DVR. Participants' EAP performance, attention allocation and behavioural patterns were measured and analysed. As hypothesized, t‐tests, repeated‐measures ANOVA and lag sequence analysis indicated that participants who learned from the DVR with an out‐of‐video PA showed better learning performance, less attention allocation on content and more frequent behavioural patterns than those with an in‐video PA. Overall, our findings suggest that in a VR educational environment of video lectures, instructors should consider using an out‐of‐video PA to increase their social presence and improve students' learning experience.

Practitioner notes

What is already known about this topic
EAP is an indispensable skill set of academic communication for university students.
PA is an effective social cue in video lectures to promote learning.
VR has been widely applied in language learning.

What this paper adds
Reveals the relationship between the PA's positioning and the learners' EAP performance and deepens the understanding of the PA's positioning in video lectures of a DVR learning environment.
Provides empirical analysis of natural eye‐tracking during the video learning in DVR scene and EAP data during the experimental condition.
Students who learned from the DVR with an out‐of‐video PA showed better learning performance, less attention allocation on content and more frequent behavioural patterns than those with an in‐video PA.

Implications for practice and/or policy
Designers are encouraged to use DVR with an out‐of‐video PA to enhance students' social presence and learning experience.