As science and technology advance, the demands placed on robots continue to rise. Robots are increasingly used in industrial production and in everyday life, so they need a better ability to perceive and process information about their external environment. Visual servo systems emerged to meet this need. Pose estimation is a major problem in current vision systems and has great application value in positioning and navigation, target tracking and recognition, virtual reality, and motion estimation. This paper therefore studies robot arm pose estimation and control based on machine vision. It first analyzes machine vision technology and then reports experiments that compare the accuracy and stability of two methods for robot arm pose estimation: centralized and distributed Kalman data fusion. The experimental results showed that, with the centralized Kalman data fusion method at a noise level of 1 pixel, the maximum error of the X-axis angle was only 0.55 and the average error was 0.02. With the distributed Kalman data fusion method, the average error of the X-axis displacement was 0.06 and the maximum error was 17.66. The centralized Kalman data fusion method was better in terms of both accuracy and stability. Overall, however, both methods performed very well and could accurately control the position and posture of the manipulator.