Robot arm monitoring is often required in intelligent industrial scenarios. A two-stage method for robot arm attitude estimation based on multi-view images is proposed. In the first stage, a super-resolution keypoint detection network (SRKDNet) is proposed. The SRKDNet incorporates a subpixel convolution module in the backbone neural network, which can output high-resolution heatmaps for keypoint detection without significantly increasing the computational resource consumption. Efficient virtual and real sampling and SRKDNet training methods are put forward. The SRKDNet is trained with generated virtual data and fine-tuned with real sample data. This method decreases the time and manpower consumed in collecting data in real scenarios and achieves a better generalization effect on real data. A coarse-to-fine dual-SRKDNet detection mechanism is proposed and verified. Full-view and close-up dual SRKDNets are executed to first detect the keypoints and then refine the results. The keypoint detection accuracy, PCK@0.15, for the real robot arm reaches up to 96.07%. In the second stage, an equation system, involving the camera imaging model, the robot arm kinematic model and keypoints with different confidence values, is established to solve the unknown rotation angles of the joints. The proposed confidence-based keypoint screening scheme makes full use of the information redundancy of multi-view images to ensure attitude estimation accuracy. Experiments on a real UR10 robot arm under three views demonstrate that the average estimation error of the joint angles is 0.53 degrees, which is superior to that achieved with the comparison methods.
Read full abstract