This paper proposes and evaluates, for the first time, a top-down (dorsal-view), depth-only deep learning system for accurately identifying individual cattle, and provides the associated code, datasets, and training weights for immediate reproducibility. As herd sizes grow, the cow-to-human ratio on the farm rises, making manual monitoring of individuals increasingly challenging; real-time cattle identification is therefore essential for farms and a crucial step towards precision livestock farming. Building on our previous work, this paper introduces a deep-metric-learning method for cattle identification that uses depth data, acquired with an off-the-shelf 3D camera, as a novel biometric. In contrast to our previous work, which was limited to breeds with distinct coat patterns, this study introduces a breed-agnostic pipeline for universal cattle identification. The results show that depth, as a biometric, can broaden the real-world applicability of our method to the remaining 68% of UK cattle breeds that lack distinctive coat patterns. The method relies on Convolutional Neural Network (CNN) and Multi-Layer Perceptron (MLP) backbones that learn well-generalised embedding spaces from body shape to differentiate individuals, requiring neither species-specific coat patterns nor close-up muzzle prints for operation. The network embeddings are classified with a simple algorithm such as k-Nearest Neighbours (k-NN) for highly accurate identification, eliminating the need to retrain the network when enrolling new individuals. We evaluate two backbone architectures: a Residual Neural Network (ResNet), as previously used to identify Holstein Friesians from RGB images, and PointNet, which is specialised to operate on 3D point clouds. We also present CowDepth2023, a new dataset of 21,490 synchronised colour-depth image pairs of 99 cows, to evaluate the backbones.
Both ResNet and PointNet, which consume depth maps and point clouds, respectively, achieved accuracy on par with the coat-pattern-based backbone. The new universal methodology also handles all-black and all-white breeds, where the previous coat-pattern-based approach fell short. The ResNet colour backbone achieved 99.97% k-NN identification accuracy, while PointNet reached 99.36%. Furthermore, we show that the PointNet architecture is robust to noise and missing data by substantially reducing the number of input 3D points and measuring the resulting drop in accuracy. Our research indicates that these techniques can identify animals from dorsal-view depth maps alone. Despite the substantial inter-class variation in body shape, we show, using Gradient-weighted Class Activation Mapping (Grad-CAM) and Point Cloud Saliency Mapping (PC-SM), that the models spatially rely on similar body surfaces.
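The enrolment-free identification step described above, matching a query embedding against a gallery of enrolled individuals with k-NN, can be illustrated by the following minimal sketch. All names and data here are hypothetical (the `knn_identify` helper and toy embeddings are not from the paper's codebase); it assumes only that the backbone maps each depth image to a fixed-length embedding vector.

```python
import numpy as np

def knn_identify(gallery_emb, gallery_ids, query_emb, k=5):
    """Assign the query embedding the majority label among its k nearest
    gallery embeddings (Euclidean distance). Enrolling a new individual
    only extends the gallery; the network is never retrained."""
    dists = np.linalg.norm(gallery_emb - query_emb, axis=1)
    nearest = np.argsort(dists)[:k]
    ids, counts = np.unique(gallery_ids[nearest], return_counts=True)
    return ids[np.argmax(counts)]

# Toy gallery: 128-D embeddings for two enrolled cows (synthetic data,
# clustered around different centres to mimic a well-separated space).
rng = np.random.default_rng(0)
cow_a = rng.normal(0.0, 0.1, size=(10, 128))
cow_b = rng.normal(1.0, 0.1, size=(10, 128))
gallery = np.vstack([cow_a, cow_b])
labels = np.array([0] * 10 + [1] * 10)

query = rng.normal(1.0, 0.1, size=128)  # unseen view of cow 1
print(knn_identify(gallery, labels, query, k=5))  # → 1
```

Because the classifier is non-parametric, the same gallery-extension step covers both breeds with coat patterns and the all-black or all-white breeds the depth pipeline targets.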