Facing the demand for accurate, fast, and natural human–computer interaction in multi-screen control environments, the traditional approach of switching control among screens manually with a mouse, keyboard, and similar devices suffers from low efficiency, limited flexibility, and poor user experience. In this paper, we study multi-screen precision control technology based on eye movement, aiming to overcome two difficulties: precise recognition of human visual focus in complex environments, and the poor stability of gaze estimation when control switches frequently among multiple screens. We analyze the experimental results of 1,143 eye-movement capture tasks, covering accurate recognition of human head posture and pupil gaze direction under complex backgrounds, varying illumination, and different field-of-view conditions, and we explore a theoretical modeling method for precise eye-movement-based target interaction across multiple screens. The captured eye-movement data are processed with convolutional neural networks; the model is trained through Python programming and evaluated on test samples collected from simulated flight crews in the laboratory, achieving an accuracy above 85%.
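
The abstract names convolutional neural networks trained in Python as the classification stage but gives no implementation details. The following minimal PyTorch sketch is only an assumed illustration of such a pipeline, mapping an eye-region image to a discrete screen-target class; the layer sizes, the 64x64 grayscale input, and the nine-target output are hypothetical and not taken from the paper.

    # Minimal illustrative sketch (not the authors' implementation): a small CNN
    # that maps a grayscale eye-region crop to one of several screen-target classes.
    # All layer sizes, class counts, and names here are hypothetical assumptions.
    import torch
    import torch.nn as nn

    class GazeTargetCNN(nn.Module):
        def __init__(self, num_targets: int = 9):  # e.g. 9 gaze regions across screens (assumed)
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1),  # single-channel eye-region input
                nn.ReLU(),
                nn.MaxPool2d(2),                             # 64x64 -> 32x32
                nn.Conv2d(16, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),                             # 32x32 -> 16x16
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(32 * 16 * 16, 128),
                nn.ReLU(),
                nn.Linear(128, num_targets),                 # logits over screen targets
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.classifier(self.features(x))

    model = GazeTargetCNN()
    dummy = torch.randn(4, 1, 64, 64)  # batch of 4 hypothetical 64x64 eye crops
    logits = model(dummy)
    print(logits.shape)                # torch.Size([4, 9])

In practice such a classifier would be trained on the labeled eye-movement captures described above and its predicted target class used to route control to the corresponding screen; the sketch omits training, calibration, and the head-posture branch entirely.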