Timely and precise farm inspection, which involves identifying harmful insects and diseases, is crucial for safeguarding crop production. Traditional vision-based pest recognition methods typically require extensive annotated data for each pest species and a lengthy training process. This approach is time-consuming, labor-intensive, and prone to human error. Zero-shot learning offers a potential solution by enabling pest segmentation, and thereby supporting pest control, without requiring explicit training data for each species. This study supports farmers in automatically identifying ten common pests and their precise locations in real-world outdoor environments. The zero-shot pest segmentation is based on a hybrid approach combining Explainable Contrastive Language-Image Pre-training (ECLIP) and the Segment Anything Model (SAM). Moreover, an optimized super-resolution model and various data augmentation methods are applied to improve the quality of the dataset. Lastly, a mask post-processing step removes highly overlapping segmented masks and noise blobs caused by the complex background. A mean Intersection over Union (mIoU) of 66.5% on the validation set demonstrates the potential of zero-shot methods for automated pest segmentation during farm inspections. This research lays the foundation for accurate pest monitoring systems capable of adapting to new pests, ultimately improving agricultural productivity.
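The mask post-processing step and the mIoU metric mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this study: the overlap criterion (pairwise IoU), the area-based noise filter, the thresholds `min_area` and `overlap_thresh`, and all function names are assumptions made for illustration. Masks are represented as sets of `(row, col)` pixel coordinates to keep the example dependency-free.

```python
def rect(r0, c0, r1, c1):
    """Hypothetical helper: rectangular mask as a set of (row, col) pixels."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def iou(a, b):
    """Intersection over Union of two pixel-set masks."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def postprocess(masks, min_area=4, overlap_thresh=0.7):
    """Drop small noise blobs and near-duplicate masks.

    Assumed policy: visit masks from largest to smallest, discard any mask
    smaller than `min_area` pixels, and discard any mask whose IoU with an
    already kept mask reaches `overlap_thresh`.
    """
    kept = []
    for mask in sorted(masks, key=len, reverse=True):
        if len(mask) < min_area:
            continue  # treated as a noise blob
        if any(iou(mask, k) >= overlap_thresh for k in kept):
            continue  # highly overlapping with a mask we already kept
        kept.append(mask)
    return kept


def mean_iou(predictions, ground_truths):
    """Mean IoU over paired predicted and ground-truth masks."""
    scores = [iou(p, g) for p, g in zip(predictions, ground_truths)]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    pest_a = rect(0, 0, 4, 4)        # 16 pixels
    pest_a_dup = rect(0, 0, 4, 5)    # 20 pixels, mostly the same region
    pest_b = rect(6, 6, 9, 9)        # 9 pixels, separate object
    noise = rect(10, 10, 11, 11)     # 1 pixel

    cleaned = postprocess([pest_a, pest_a_dup, noise, pest_b])
    print(len(cleaned), "masks kept")
    print("mIoU:", mean_iou([pest_a, pest_b], [pest_a_dup, pest_b]))
```

Keeping the larger of two overlapping masks is only one possible policy; ranking by a model confidence score instead of area would be an equally reasonable choice.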