Animal behavior is key to understanding and managing animal health and welfare. Combining cameras with artificial intelligence holds significant potential, particularly because it eliminates the need to handle animals and allows several traits to be measured simultaneously, including activity, space utilization, and inter-individual distance. The primary challenge in using these techniques is attributing the data to individual animals, known in computer science as the multiple object tracking problem. In this article, we propose an original solution called “Puzzle.” Just as one solves a puzzle by starting with the border pieces, which are easy to position, our approach starts with the video sequences in which tracking is straightforward. This initial phase serves to train a Convolutional Neural Network (CNN) that learns the appearance cues of each animal. The CNN is then applied to the entire video, together with distance-based metrics, to associate each detection with an animal ID. We illustrate our method on outdoor goat tracking, achieving a rate of correct tracking above 90%. We discuss the impact of the criteria used for animal ID association, depending on whether they are based solely on location, solely on appearance, or on a combination of both. Our findings indicate that, by adopting the puzzle paradigm and tailoring the appearance CNN to the specific video, relying on appearance alone can yield satisfactory results. Finally, we explore the influence of tracking performance on two behavioral studies, one estimating space utilization and the other activity. The results show that the estimation error remains below 10%. The code is entirely open source and extensively documented. It is also linked to a data paper intended to facilitate the training of any automatic goat detection algorithm, with the goal of fostering open access within the deep-learning livestock community.
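
To make the association step concrete, the following is a minimal sketch of how appearance scores and a distance-based metric could be combined to assign detections to animal IDs in one frame. It is an illustration under stated assumptions, not the Puzzle implementation: the function names, the weight `alpha`, the normalization constant `max_dist`, and the exhaustive search over assignments are all hypothetical choices made for this example.

```python
import math
from itertools import permutations


def association_cost(appearance_probs, detections, last_positions,
                     alpha=0.5, max_dist=100.0):
    """Build cost[d][k], the cost of assigning detection d to animal ID k.

    appearance_probs[d][k]: assumed per-ID probability from an appearance
        classifier (for example a CNN softmax output) for detection d.
    detections[d]: (x, y) centre of detection d in the current frame.
    last_positions[k]: (x, y) last known centre of animal k.
    alpha: hypothetical weight; 1.0 uses appearance only, 0.0 location only.
    max_dist: hypothetical distance (in pixels) at which the location
        cost saturates at 1.0.
    """
    cost = []
    for d, (x, y) in enumerate(detections):
        row = []
        for k, (px, py) in enumerate(last_positions):
            dist_cost = min(math.hypot(x - px, y - py) / max_dist, 1.0)
            app_cost = 1.0 - appearance_probs[d][k]
            row.append(alpha * app_cost + (1.0 - alpha) * dist_cost)
        cost.append(row)
    return cost


def best_assignment(cost):
    """Return the one-to-one assignment of detections to IDs with minimal
    total cost, found by exhaustive search (only practical for small groups).
    Assumes there are at least as many IDs as detections.
    """
    n_det, n_ids = len(cost), len(cost[0])
    best, best_total = None, math.inf
    for ids in permutations(range(n_ids), n_det):
        total = sum(cost[d][k] for d, k in enumerate(ids))
        if total < best_total:
            best, best_total = ids, total
    return list(best)


# Toy frame with two detections and two known animals.
detections = [(10, 10), (50, 50)]
last_positions = [(48, 52), (12, 9)]
appearance_probs = [[0.1, 0.9], [0.8, 0.2]]

cost = association_cost(appearance_probs, detections, last_positions)
print(best_assignment(cost))  # [1, 0]
```

Setting `alpha` to 1.0 or 0.0 reproduces the appearance-only and location-only criteria compared in the article. For larger groups, the exhaustive search would typically be replaced by a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment`.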