As in-vehicle information systems (IVIS) grow more complex, the demand for innovative artificial intelligence-based interaction methods that enhance cybersecurity becomes increasingly pressing. In-air gestures offer a promising solution because they are intuitive and unique to each individual, potentially improving security in human–computer interactions. However, the impact of in-air gestures on driver distraction during in-vehicle tasks remains largely unexplored, and skeleton-based in-air gesture recognition methods for IVIS are scarce. To address these challenges, we developed a skeleton-based framework tailored for IVIS that recognizes in-air gestures and classifies them as static or dynamic. Our gesture model, tested on the large-scale AUTSL dataset, achieves accuracy comparable to state-of-the-art methods with greater efficiency on mobile devices. To compare in-air gestures with touch interactions in a driving simulation environment, we established an evaluation system that assesses the driver’s attention level during driving. Our findings indicate that in-air gestures provide a more efficient and less distracting interaction solution for IVIS in multi-goal driving environments, significantly improving driving performance by 65%. The proposed framework can serve as a valuable tool for designing future in-air gesture-based interfaces for IVIS, contributing to enhanced cybersecurity.