Abstract

The cinematic metaverse aims to create a virtual space set in the context of a film. Users enter this space as avatars and experience the film's plot firsthand in an immersive manner. This requires a well-designed computation resource allocation and synchronization algorithm that meets multi-objective joint optimization demands, such as low latency and high throughput, so that users can switch seamlessly between the virtual and real worlds and enjoy an immersive experience. Unfortunately, the explosive growth in the number of users makes it difficult to optimize multiple objectives jointly. Predicting the traffic generated by users' avatars in the cinematic metaverse is therefore important for the optimization process. Although traffic prediction models based on graph neural networks achieve superior prediction accuracy, these methods rely only on topological graph information derived from physical distances and fail to comprehensively reflect the real relationships between avatars in the cinematic metaverse. To address this issue, we present a novel Multi-Graph Representation Spatio-Temporal Attention Network (MGRSTANet) for traffic prediction in the cinematic metaverse. Specifically, based on multiple kinds of topological graph information (e.g., physical distance, centrality, and similarity), we first design a Multi-Graph Embedding (MGE) module to generate multiple graph representations, thus reflecting the real relationships between avatars more comprehensively. We then propose a Spatio-Temporal Attention (STAtt) module to extract the spatio-temporal correlations in each graph representation, thus improving prediction accuracy. We conduct simulation experiments to evaluate the effectiveness of MGRSTANet. The experimental results demonstrate that our proposed model outperforms state-of-the-art baselines in terms of prediction accuracy, making it appropriate for traffic forecasting in the cinematic metaverse.
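
To make the multi-graph idea concrete, the following minimal Python sketch builds three adjacency matrices over the same set of avatar nodes, one each for physical distance, traffic similarity, and centrality. It is only an illustration under stated assumptions: the function names, the Gaussian distance kernel, the cosine similarity measure, and the degree-based centrality weighting are hypothetical choices, since the abstract does not specify how MGRSTANet constructs these graphs.

```python
import numpy as np


def distance_graph(coords, sigma=1.0):
    """Adjacency from pairwise Euclidean distances via a Gaussian kernel (assumed)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj


def similarity_graph(traffic):
    """Adjacency from cosine similarity of historical traffic series (assumed)."""
    norms = np.linalg.norm(traffic, axis=1, keepdims=True)
    unit = traffic / np.maximum(norms, 1e-12)
    adj = np.clip(unit @ unit.T, 0.0, None)  # keep non-negative similarities only
    np.fill_diagonal(adj, 0.0)
    return adj


def centrality_graph(base_adj):
    """Reweight edges of a base graph by the endpoints' normalized degree centrality (assumed)."""
    degree = base_adj.sum(axis=1)
    cent = degree / np.maximum(degree.max(), 1e-12)
    return base_adj * np.outer(cent, cent)


def build_multi_graph(coords, traffic, sigma=1.0):
    """Stack the three graph views into an array of shape (3, N, N)."""
    a_dist = distance_graph(coords, sigma)
    a_sim = similarity_graph(traffic)
    a_cent = centrality_graph(a_dist)
    return np.stack([a_dist, a_sim, a_cent])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.random((5, 2))    # hypothetical avatar positions
    traffic = rng.random((5, 24))  # hypothetical per-avatar traffic history
    graphs = build_multi_graph(coords, traffic)
    print(graphs.shape)
```

In a model of this kind, each of the stacked graph views would typically be passed to its own embedding or attention branch before the results are fused; how MGE and STAtt do this is not stated in the abstract.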