Internet of Things (IoT) applications and resources are highly vulnerable to flood attacks, including Distributed Denial of Service (DDoS) attacks. These attacks overwhelm the targeted device with numerous network packets, making its resources inaccessible to authorized users. Such attacks may comprise attack references, attack types, sub-categories, host information, malicious scripts, etc. These details assist security professionals in identifying weaknesses, tailoring defense measures, and responding rapidly to possible threats, thereby improving the overall security posture of IoT devices. Developing an intelligent Intrusion Detection System (IDS) is highly complex due to its numerous network features. This study presents an improved IDS for IoT security that employs multimodal big data representation and transfer learning. First, the Packet Capture (PCAP) files are crawled to retrieve the necessary attacks and bytes. Second, Spark-based big data optimization algorithms handle huge volumes of data. Second, a transfer learning approach such as word2vec retrieves semantically-based observed features. Third, an algorithm is developed to convert network bytes into images, and texture features are extracted by configuring an attention-based Residual Network (ResNet). Finally, the trained text and texture features are combined and used as multimodal features to classify various attacks. The proposed method is thoroughly evaluated on three widely used IoT-based datasets: CIC-IoT 2022, CIC-IoT 2023, and Edge-IIoT. The proposed method achieves excellent classification performance, with an accuracy of 98.2%. In addition, we present a game theory-based process to validate the proposed approach formally.