Large-scale, high-quality datasets are the foundation for developing advanced artificial intelligence applications. However, creating such a benchmark dataset in a specialized field, such as precision animal management, remains a challenge because annotation and review are costly and labor-intensive. This study introduced a novel workflow named Accelerated Data Engine (ADE), designed to efficiently produce representative, high-quality computer vision datasets from raw animal surveillance footage. By incorporating referring and grounding models (R&G models) as auto-annotators, along with a distillation mechanism that acts as a dataset auditor, ADE significantly sped up the dataset construction process. The workflow takes natural language inputs as referrals to identify animal instances, delineates their body shapes, and then refines the auto-annotated data through a selection process. To demonstrate the efficacy of ADE, three 30-minute surveillance video samples featuring pigs, sheep, and cattle were examined. The results indicated that the R&G models effectively annotated animals across various farms, while the distillation mechanisms identified various detection errors, balanced the data representation, refined annotations, and verified data quality. Two high-quality cattle datasets (6.5 k and 486 frames), containing 26 k and 2.5 k cattle instances respectively, were generated through the ADE workflow from 24-hour surveillance videos on a commercial cattle farm and made publicly available. Achievable performance on the proposed datasets ranged from 74.6 % to 84.1 %. Compared with the traditional dataset construction workflow (approximately 141 h), the ADE workflow saved 78.4 % of the manual work. This approach enables the fast creation of benchmark animal datasets and is expected to enhance computer vision applications in the livestock production industry.