Abstract Leveraging computer vision (CV) methods to study the feeding behavior of pen-fed cattle presents several advantages, including monitoring animal health and identifying feed-efficient animals. This research employed a region-based convolutional neural network (R-CNN) combined with the Common Objects in Context (COCO) dataset for automatic livestock recognition using CV techniques in experimental feedlot pens. Thirty Angus-influenced steers were allocated to one pen with four automated feed intake systems (AFIS; Vytelle SENSE). CV data were recorded during daylight hours using a webcam (Microsoft LifeCam Cinema) with a resolution of 1280x720 pixels at ten frames per second, connected to video surveillance software (Contaware, Switzerland). The CV dataset obtained by the camera (Figure 1) was benchmarked against feeding behavior data obtained from observed annotations (OA) and the AFIS. The CV model was a Mask R-CNN with weights pre-trained on the COCO dataset. The Mask R-CNN algorithm can identify and locate multiple objects, objects of different scales, and overlapping objects within an image. The COCO dataset is a large and diverse set of annotated images that can be used for training and evaluating object detection models. The fully trained Mask R-CNN model takes each video frame as input and outputs a pixel segmentation for each detected object. For the CV analyses, each animal bounding box was associated with a score threshold, and each pixel within it was assigned a probability threshold, both ranging from 0 to 1. Higher score and probability thresholds cause fewer boxes and pixels to be marked as animal detections, resulting in fewer feeding events detected by the CV system. In contrast, low score and probability thresholds produce many predictions, including false detections, which might be misleading. Therefore, a dataset analysis was conducted to identify the optimal threshold combination.
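The score/probability filtering described above can be sketched as follows. This is a minimal, hypothetical illustration (the detection structure, field names, and toy values are assumptions, not the authors' implementation): each detection carries a box confidence score and a per-pixel mask probability map, boxes below the score threshold are discarded, and surviving masks are binarized at the probability threshold.

```python
# Hypothetical sketch of score/probability threshold filtering for
# Mask-R-CNN-style detections. Field names and values are illustrative only.
def filter_detections(detections, score_thresh=0.5, prob_thresh=0.5):
    """Keep boxes with score >= score_thresh; binarize masks at prob_thresh."""
    kept = []
    for det in detections:
        if det["score"] < score_thresh:
            continue  # low-confidence box: dropped entirely
        # Pixels with probability >= prob_thresh become part of the animal mask
        binary_mask = [[1 if p >= prob_thresh else 0 for p in row]
                       for row in det["mask"]]
        kept.append({"box": det["box"], "mask": binary_mask})
    return kept

# Toy example with two hypothetical detections (2x2 mask patches):
dets = [
    {"score": 0.9, "box": (10, 10, 50, 50),
     "mask": [[0.8, 0.3], [0.6, 0.9]]},
    {"score": 0.2, "box": (60, 60, 90, 90),  # below score threshold, dropped
     "mask": [[0.7, 0.7], [0.1, 0.1]]},
]
kept = filter_detections(dets)
print(len(kept))        # 1
print(kept[0]["mask"])  # [[1, 0], [1, 1]]
```

Raising either threshold shrinks the set of retained boxes and mask pixels, which is why the abstract notes that high thresholds miss feeding events while low thresholds admit false detections.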
We evaluated the accuracy and precision of the CV model against OA and the AFIS by analyzing a subset of feeding events at one feed bunk, using 0.5 score and 0.5 probability thresholds. We obtained counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) to calculate accuracy = (TP + TN) / (TP + TN + FP + FN) and precision = TP / (TP + FP) (Table 1). As a result, the CV model achieved 95.39% and 99.82% accuracy and 93.60% and 99.90% precision when compared with OA and the AFIS, respectively. Our findings showcase the promising capabilities of the Mask R-CNN algorithm and the COCO dataset for detecting beef cattle feeding behavior in feedlot pens. This approach holds significant value for the cattle industry by enabling precise monitoring and analysis of individual animal behavior within feedlot pens, facilitating early detection of behavioral abnormalities.
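The two metrics above can be computed directly from the confusion-matrix counts. The sketch below uses hypothetical TP/FP/TN/FN values purely for illustration; the actual counts are those reported in Table 1.

```python
# Accuracy and precision from confusion-matrix counts, as defined in the text.
def accuracy(tp, tn, fp, fn):
    # accuracy = (TP + TN) / (TP + TN + FP + FN)
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    # precision = TP / (TP + FP)
    return tp / (tp + fp)

# Hypothetical counts for illustration only (not the values in Table 1):
tp, fp, tn, fn = 90, 5, 100, 5
print(f"accuracy  = {accuracy(tp, tn, fp, fn):.2%}")  # 95.00%
print(f"precision = {precision(tp, fp):.2%}")         # 94.74%
```

Note that precision ignores true negatives: it measures how many of the CV model's detected feeding events were real, while accuracy also credits frames correctly classified as non-feeding.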