Abstract In recent years, the widespread use of personalized cartoon models in film, television, games, and other fields has made 3D animation capture and driving technology an important research topic in virtual reality. This paper studies 3D animation capture technology from two perspectives: human body movement and facial expression. Human body movement node data are collected with sensors, the movement state is represented in three-dimensional space as quaternions, and Euler angles and rotation matrices are used to convert the data. Facial expression data are acquired with optical motion capture technology and used to build personalized models. The data are entered into a database, dynamic 3D re-modeling is completed after data segmentation, and the 3D animation is produced with 3D motion capture driving technology. With sensor-based human body movement capture and optical facial capture, the fluency index is about 80, and it exceeds 100 in four frames (105, 110, 127, and 128). The animated movie designed with 3D animation capture driving technology scores higher on average than the control group in four aspects: interactivity, interestingness, informativeness, and behavioral change. The differences between the two groups are 3.1977, 1.899, 0.4378, and 1.1444, respectively, with a reported p-value below 0.01, indicating that the animated movie designed with 3D motion capture technology gives the audience a better viewing experience.
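The conversion from a quaternion to a rotation matrix and to Euler angles mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this paper. The component order (w, x, y, z), the ZYX (yaw, pitch, roll) Euler convention, and the function names `normalize`, `quat_to_matrix`, and `quat_to_euler` are assumptions made for illustration only.

```python
import math


def normalize(q):
    """Return the unit quaternion for q = (w, x, y, z)."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    if n == 0.0:
        raise ValueError("zero-norm quaternion has no orientation")
    return (w / n, x / n, y / n, z / n)


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (row-major lists)."""
    w, x, y, z = normalize(q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def quat_to_euler(q):
    """Convert a quaternion to (roll, pitch, yaw) in radians.

    Assumed convention: roll about x, pitch about y, yaw about z,
    composed in ZYX order.
    """
    w, x, y, z = normalize(q)
    roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
    # Clamp to [-1, 1] so rounding error near +/-90 degrees of pitch
    # cannot push asin outside its domain.
    sin_pitch = max(-1.0, min(1.0, 2 * (w * y - x * z)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
    return roll, pitch, yaw


# Example: a sensor node rotated 90 degrees about the z axis.
q_node = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
rotation = quat_to_matrix(q_node)
roll, pitch, yaw = quat_to_euler(q_node)
```

Normalizing before conversion is a defensive choice, since raw sensor quaternions may drift slightly from unit length. Near a pitch of 90 degrees the Euler representation becomes degenerate (gimbal lock), which is one common reason for storing joint orientations as quaternions and converting only when needed.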