For the dynamic film and television special effects industry, creating visually stunning and valuable graphics requires the integration of excellent deep-learning algorithms. This paper presents an enhanced version of the 3-D Convolutional Neural Network (3-D CNN), specifically tailored to meet the demanding needs of multi-frame television and film special effects. The version’s key feature is its efficient data handling through coupled precision training, significantly increasing computational efficiency while reducing memory needs. This method, which combines floating-issue operations of 16 and 32 bits, is ideal for efficiently processing large amounts of high-resolution video data. The proposed 3-D CNN structure excels in extracting and analysing complex spatiotemporal capabilities from video sequences, capturing the spatial and temporal nuances crucial in film imagery. This capability is important for accomplishing excessive fidelity in visual results, ensuring seamless integration with live motion pics. Incorporating a cutting-edge attention mechanism inspired by the aid of transformer-based architectures, the model specialises inside the maximum pertinent components of video frames to enhance the quality and realism of outcomes. Furthermore, the model boasts an excessive-resolution processing characteristic, permitting simultaneous capture of first-rate records and broader scene context. This guarantees consistency and realism in outcomes, from complicated textures to the overarching visual narrative. Advanced regularisation strategies are hired to prevent overfitting, permitting the model to generalise efficiently through numerous film manufacturing conditions. The state-of-the-art 3-D information augmentation techniques make the model robust and prepare it to deal with an intensive variety of challenging situations involving computer special effects. Real-time processing competencies make the model a game-changer for on-set visible outcomes and adjustments. Designed for seamless integration with the famous CGI software program software utility, it allows a harmonious aggregate of AI and ingenious creativity. Acknowledging the environmental impact of high-powered computing, the model consists of power-efficient computational techniques that align with sustainable computing practices, lower operational costs, and are related to processing complex video statistics. model boasts an excessive-resolution processing characteristic, permitting simultaneous capture of first-rate records and broader scene context. This guarantees consistency and realism in outcomes, from complicated textures to the overarching visual narrative. Advanced regularisation strategies are hired to prevent overfitting, permitting the model to generalize efficiently through numerous film manufacturing conditions. The state-of-the-art 3-D information augmentation techniques, in addition, make the model robust, and prepare it to deal with an intensive variety of computer special effects challenging situations.