The axillary buds that grow between the main and lateral branches of tomato plants waste nutrients and reduce yield, so they must be removed regularly. Currently, these buds are removed manually, which requires substantial manpower and incurs high production costs, particularly at large scale. Replacing manual labor with robots can reduce these costs. However, a critical challenge is accurately identifying the multiple targets on a tomato plant and precisely locating the point at which each axillary bud should be removed. This paper therefore proposes a multi-target identification and localization method for tomato plants based on the VGG16-UNet model. With pretrained weights, the VGG16-UNet model achieved a mean intersection over union of 85.33% and a pixel accuracy of 92.47%, which were 5.02% and 4.08% higher, respectively, than those of the VGG16-UNet without pretrained weights, and it identified the main branch, lateral branch, and axillary bud regions. Then, starting from the multi-target segmentation of the tomato plants produced by the VGG16-UNet model, the axillary bud regions were extracted by HSV color space conversion and color threshold range selection. Morphological dilation and erosion operations were used to remove noise and connect adjacent regions of the same target. The endpoints and centroids of the axillary buds were identified using a feature point extraction algorithm. Whether an axillary bud lay to the left or right was judged from the position of its centroid relative to the main branch. Finally, the coordinates of the axillary bud removal points were calculated from the feature points, according to the positional relationship between the axillary bud and the branch. Experimental results showed that the average accuracy of axillary bud pruning point recognition was 85.5%.
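
The post-segmentation steps (HSV thresholding, morphological cleanup, centroid extraction, and the left/right decision) can be illustrated with a minimal sketch. It is not the authors' implementation, and every concrete value in it is an assumption: the segmentation output is taken to be color-coded with green for axillary buds and red for the main branch, the HSV ranges and the 3x3 structuring element are illustrative, and the main branch position is summarized by the x-coordinate of its centroid. A practical pipeline would more likely use an image library such as OpenCV; only the Python standard library is used here so that the sketch stays self-contained. Endpoint extraction and the pruning point calculation are not shown because the abstract does not specify them.

```python
import colorsys

def hsv_mask(image, lo, hi):
    """Mark pixels whose (h, s, v), each scaled to 0..1, lies within [lo, hi]."""
    def inside(rgb):
        hsv = colorsys.rgb_to_hsv(*(c / 255 for c in rgb))
        return all(l <= c <= u for c, l, u in zip(hsv, lo, hi))
    return [[inside(px) for px in row] for row in image]

def _morph(mask, combine):
    """3x3 morphology: combine=any dilates, combine=all erodes.
    Neighbours outside the image are ignored."""
    rows, cols = len(mask), len(mask[0])
    def window(y, x):
        return [mask[j][i]
                for j in range(max(y - 1, 0), min(y + 2, rows))
                for i in range(max(x - 1, 0), min(x + 2, cols))]
    return [[combine(window(y, x)) for x in range(cols)] for y in range(rows)]

def clean(mask):
    """Connect nearby fragments of one target, then remove isolated specks."""
    closed = _morph(_morph(mask, any), all)   # dilate, then erode: bridge gaps
    return _morph(_morph(closed, all), any)   # erode, then dilate: drop specks

def centroid(mask):
    """Mean (x, y) of the foreground pixels, or None for an empty mask."""
    pts = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not pts:
        return None
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))

def bud_side(bud_mask, stem_mask):
    """Assumed rule: compare the bud centroid's x with the main branch centroid's x."""
    return "left" if centroid(bud_mask)[0] < centroid(stem_mask)[0] else "right"

# Synthetic 13x10 color-coded segmentation (hypothetical colors).
GREEN, RED, BLACK = (0, 255, 0), (255, 0, 0), (0, 0, 0)
image = [[BLACK] * 13 for _ in range(10)]
for y in range(10):
    image[y][11] = RED                 # main branch: vertical line at x = 11
for y in (4, 5, 6):
    for x in (2, 3, 5):                # axillary bud with a one-pixel gap at x = 4
        image[y][x] = GREEN
image[2][9] = GREEN                    # isolated speck of noise

# Illustrative HSV ranges (assumptions), channels scaled to 0..1.
bud = clean(hsv_mask(image, (0.25, 0.5, 0.5), (0.45, 1.0, 1.0)))
stem = hsv_mask(image, (0.0, 0.5, 0.5), (0.05, 1.0, 1.0))

print(centroid(bud))         # (3.5, 5.0)
print(bud_side(bud, stem))   # left
```

In this toy example the closing step fills the gap in the bud region and the opening step deletes the isolated pixel, so the centroid is computed over a single connected bud region before it is compared with the main branch.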