• The two-stage design keeps pig face recognition non-contact while increasing its true positive rate.
• Transfer learning is used in the pig face detector to improve the accuracy of pig face detection.
• The triplet margin loss function from person re-identification is applied to pig face recognition, maintaining model accuracy while reducing the number of model parameters.

In recent years, as breeding farms have grown in scale, an increasing number of farms have proposed precision feeding of individual animals to improve animal welfare and increase farm output. How to accurately identify each individual animal and provide a targeted breeding program for it has therefore become a focus. We have designed and evaluated a lightweight pig face recognition model based on a deep convolutional neural network, which achieves a high pig face recognition rate in complex environments. The model is a two-stage convolutional neural network. The first stage is responsible for pig face detection. Built on the EfficientDet-D0 model, it raises the average precision of pig face detection from 90.7% to 99.1% by employing a dataset sampling technique. The second stage is responsible for pig face classification: it uses six classification models (ResNet-18, ResNet-34, DenseNet-121, Inception-v3, AlexNet, and VGGNet-19) as backbones, and we propose an improved method based on the triplet margin loss function. To strengthen network performance, a multitask learning method enables the network to effectively cluster the features produced by the feature extractor layer. The k-nearest neighbor algorithm then replaces the parameter-heavy fully connected layer to classify the features. These improved models reach a maximum classification accuracy of 96.8% for 28 pigs.
The parameter counts of these improved models are reduced to as little as 4.32% of the original. Finally, the two-stage model combining EfficientDet-D0 and DenseNet-121 achieves a mean average precision (mAP) of 91.35% for face recognition of 28 pigs. Compared with an EfficientDet-D0 network trained with the one-stage method, the mAP is improved by 28%. In addition, we reorganized the original dataset and performed 10-fold cross-validation, obtaining an mAP of 94.04%.
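
The second-stage idea of training embeddings with a triplet margin loss and then classifying them with k-nearest neighbors instead of a fully connected layer can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the margin of 1.0, k = 3, and the toy 2-D "embeddings" are all assumptions chosen for illustration, and a real system would compute the loss on backbone features inside a deep learning framework.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Loss for one triplet: max(d(a, p) - d(a, n) + margin, 0).

    It is zero once the negative is farther from the anchor than the
    positive by at least `margin`, which pushes same-identity features
    into tight clusters.
    """
    return max(
        euclidean(anchor, positive) - euclidean(anchor, negative) + margin,
        0.0,
    )


def knn_predict(query, gallery, labels, k=3):
    """Classify `query` by majority vote among its k nearest gallery embeddings.

    No trainable parameters are involved, unlike a fully connected layer.
    """
    ranked = sorted(range(len(gallery)), key=lambda i: euclidean(query, gallery[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D embeddings standing in for feature-extractor outputs (made-up values).
gallery = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],   # hypothetical identity "pig_A"
    [2.0, 2.1], [2.1, 2.0], [1.9, 2.0],     # hypothetical identity "pig_B"
]
labels = ["pig_A", "pig_A", "pig_A", "pig_B", "pig_B", "pig_B"]

# Well-separated triplet: the loss is already zero.
loss_easy = triplet_margin_loss(gallery[0], gallery[1], gallery[3])
# Violating triplet (positive and negative swapped): the loss is positive.
loss_hard = triplet_margin_loss(gallery[0], gallery[3], gallery[1])

pred = knn_predict([0.2, 0.1], gallery, labels, k=3)
```

Because the classifier in this sketch is only a nearest-neighbor lookup over stored embeddings, it adds no trainable weights, which is the intuition behind the parameter reduction reported above.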