To capture and analyze patients' motion states in real time and to improve the assessment of sports injuries, this study builds on image recognition within visual image capture technology. First, a multiscale attention mechanism is introduced into the U-Net image segmentation model to improve the preprocessing stage of image recognition. Then, a convolutional neural network image recognition model is optimized with gradient-weighted class activation mapping (Grad-CAM). The two methods are combined and applied to sports injury image processing to verify their effectiveness. The results show that the improved segmentation model reaches an F1 score of 98.85% and a precision of 98.74% on the database, a clear improvement in segmentation accuracy. The optimized image recognition method achieves accuracies of about 96% on the training set and 98% on the test set. With the two methods combined, the processing accuracy for sports injury medical images is 97%, and the running time is within 4 s. The combined approach is therefore both accurate and efficient, providing a technical and methodological basis for sports injury rehabilitation training.
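
For readers unfamiliar with the segmentation metrics reported above, the following minimal sketch shows how precision and the F1 score are conventionally computed for a binary segmentation mask. It is an illustration only, not the evaluation code used in this study: the function name `precision_f1` and the toy masks are hypothetical, and the masks are assumed to be same-shaped 2-D lists of 0/1 pixel labels.

```python
def precision_f1(pred_mask, true_mask):
    """Return (precision, F1) for two binary masks of identical shape.

    pred_mask and true_mask are 2-D lists of 0/1 values (hypothetical
    input format chosen for this illustration).
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # pixel correctly marked as foreground
            elif p and not t:
                fp += 1          # pixel wrongly marked as foreground
            elif t and not p:
                fn += 1          # foreground pixel that was missed

    # Guard against division by zero when a mask has no foreground.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, f1


# Toy example: a 2x3 predicted mask against its ground truth.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
precision, f1 = precision_f1(pred, truth)
print(f"precision={precision:.3f}, F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a high value such as the one reported here requires that the model neither over-segments nor misses injured regions.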