ESP students need to develop their multimodal literacy to become literate in today's professional spaces. For this purpose, ESP teachers should revisit their pedagogical practices to best engage students in the navigation and construction of multimodal genres. As a case in point, we explore PechaKucha (PK) presentations. This multimodal genre consists of 20 slides, each of which advances automatically after 20 s. PK presentations entail a complex format that requires speakers to choose how to convey content, design suitable visuals, and engage audiences. The dataset for the study consists of seven PK presentations delivered during a social event at an architecture conference. Adopting a multimodal discourse analysis lens, we analyse this set of PK presentations in terms of rhetorical structure and the way in which intersemiotic relations unfold (the synchronisation between speech and visuals and the modal density of slides). The analysis demonstrates that PK presentations entail an intricate multimodal composition consisting of three moves in which professional and personal narratives intertwine. The examination of intersemiotic relations reveals how speech and visuals interplay effectively to transmit meaning and engage the audience. The results of this study provide critical information to design a research-informed pedagogy to enhance ESP students' multimodal literacy.