Changes in posture are a key factor limiting the accurate identification of individual cows. To achieve non-contact, high-precision detection and identification of individual cows in a farm environment, an individual cow identification method that fuses RetinaFace with an improved FaceNet was proposed. A MobileNet-enhanced RetinaFace was applied to mitigate the impact of the output channel count and the convolution kernels by combining depthwise convolution with pointwise convolution. Regression predictions of cow facial features and keypoints were generated for faces at varying distances, scales and sizes. The core feature network of FaceNet was improved by integrating MobileNet, and the loss function was jointly optimized with cross-entropy loss and triplet loss to obtain faster and more stable convergence. The distances between the generated embedding vectors of cow facial features correspond to the similarity between cow faces, which enables accurate matching. In cow face detection, RetinaFace had false negative rates of 2.67%, 0.66%, 2.67% and 3.33% under occlusion, no occlusion, low light and bright light, respectively. Across facial patterns, the false negative rates for black-and-white, pure black and pure white faces were 1.33%, 6.00% and 8.00%, respectively. Across facial postures, the false negative rates for face upward, bowing down, profile and normal posture were 1.33%, 1.33%, 4.00% and 0.66%, respectively. The improved FaceNet model achieved an accuracy of 99.50% on the training set and 83.60% on the test set. Compared with YOLOX, the recognition model presented in this research improved the accuracy of cow face detection under occlusion, no occlusion and bright light by 2.67%, 0.40% and 0.40%, respectively. Moreover, its accuracy for pure black and pure white patterns exceeded that of YOLOX by 1.06% and 5.71%, respectively.
Additionally, its accuracy rates for face upward, bowing down, profile and normal posture were higher than those of YOLOX by 2.00%, 3.34%, 2.66% and 0.40%, respectively. The proposed model can accurately identify individual cows in natural scenes.
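
As an illustration of the joint loss and the distance-based matching described above, the following is a minimal sketch in plain Python and not the implementation used in this work. The squared Euclidean distance on L2-normalized embeddings, the margin of 0.2, the weight `ce_weight`, the matching `threshold`, and all function names are assumptions made for the example; the abstract does not specify them.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumed preprocessing for embeddings)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The margin value is an assumption for this sketch.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(anchor, positive, negative, logits, label,
               margin=0.2, ce_weight=1.0):
    """Triplet loss plus weighted cross-entropy (the weighting is assumed)."""
    return (triplet_loss(anchor, positive, negative, margin)
            + ce_weight * cross_entropy(logits, label))


def match(query, gallery, threshold=1.0):
    """Return the gallery identity nearest to the query embedding.

    Returns None when the nearest distance exceeds the (assumed) threshold.
    """
    best_id, best_dist = None, float("inf")
    for cow_id, embedding in gallery.items():
        dist = squared_distance(query, embedding)
        if dist < best_dist:
            best_id, best_dist = cow_id, dist
    return best_id if best_dist <= threshold else None


if __name__ == "__main__":
    # Toy 2-D embeddings; real embeddings would come from the network.
    cow_a = l2_normalize([1.0, 0.0])
    cow_a_again = l2_normalize([0.9, 0.1])
    cow_b = l2_normalize([0.0, 1.0])
    print(joint_loss(cow_a, cow_a_again, cow_b, logits=[2.0, 0.5], label=0))
    print(match(cow_a_again, {"cow_a": cow_a, "cow_b": cow_b}))
```

In this sketch the triplet term pulls embeddings of the same cow together and pushes different cows apart, while the cross-entropy term supplies a classification signal; the same distance is then reused at inference time to match a query face against enrolled identities.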