Wheat breeding heavily relies on the observation of various traits during the wheat growth process. Among all traits, wheat head density stands out as a particularly crucial characteristic. Despite the realization of high-throughput phenotypic data collection for wheat, the development of efficient and robust models for extracting traits from raw data remains a significant challenge. Numerous fully supervised target detection algorithms have been employed to address the wheat head detection problem. However, constrained by the exorbitant cost of dataset creation, especially the manual annotation cost, fully supervised target detection algorithms struggle to unleash their full potential. Semi-supervised training methods can leverage unlabeled data to enhance model performance, addressing the issue of insufficient labeled data. This paper introduces a one-stage anchor-based semi-supervised wheat head detector, named “Wheat Teacher”, which combines two semi-supervised methods, pseudo-labeling, and consistency regularization. Furthermore, two novel dynamic threshold components, Pseudo-label Dynamic Allocator and Loss Dynamic Threshold, are designed specifically for wheat head detection scenarios to allocate pseudo-labels and filter losses. We conducted detailed experiments on the largest wheat head public dataset, GWHD2021. Compared with various types of detectors, Wheat Teacher achieved a mAP0.5 of 92.8% with only 20% labeled data. This result surpassed the test outcomes of two fully supervised object detection models trained with 100% labeled data, and the difference with the other two fully supervised models trained with 100% labeled data was within 1%. Moreover, Wheat Teacher exhibits improvements of 2.1%, 3.6%, 5.1%, 37.7%, and 25.8% in mAP0.5 under different labeled data usage ratios of 20%, 10%, 5%, 2%, and 1%, respectively, validating the effectiveness of our semi-supervised approach. These experiments demonstrate the significant potential of Wheat Teacher in wheat head detection.