This paper introduces a Multi-Object Tracking (MOT) framework for agricultural applications that estimates global positions in pixel coordinates using the local feature matching transformer (LoFTR). We design an efficient tracker that extends a state-of-the-art tracking algorithm with a novel association strategy based on the spatial information of targets that leave and later re-enter the camera field of view. We evaluate our framework on the publicly available LettuceMOT benchmark dataset and on an adapted version of the AppleMOTS benchmark dataset, which we call AppleMOT. Our experimental results show that our method outperforms cutting-edge algorithms for robotic plant tracking on the LettuceMOT dataset, with average improvements in the evaluation metrics of up to 25% over the best publicly available results, which we attribute to our spatial association approach. On the AppleMOT dataset, we obtain bounding-box-based MOT evaluation metrics comparable to the segmentation-based (MOTS) metrics reported in the original AppleMOTS paper. These findings highlight the effectiveness and potential of our approach in addressing the unique challenges posed by agricultural environments.