Small targets are common in many fields, and their detection is widely used in aerospace, video monitoring, and industrial inspection. However, because small targets occupy few pixels and have low resolution, detection accuracy is low and the false-detection rate is high. To address this, an improved small-target detection model based on YOLOv5 is proposed. First, an additional detection head is added to increase the number of tiny targets detected and to improve small-target detection performance. Second, involution is used between the backbone and the neck to enrich the channel information of the feature maps. Third, the model introduces BiFormer, whose bi-level routing attention mechanism captures global and local feature information simultaneously. Finally, a context augmentation module (CAM) is inserted into the neck to improve the feature-fusion structure. In addition, YOLOv5’s original loss function is replaced so that the ground-truth box and the predicted box are both taken into account. Experimental results on the public VisDrone2019 dataset show that the proposed model increases precision (P) by 13.43%, recall (R) by 11.28%, and mAP@.5 and mAP@[.5:.95] by 13.88% and 9.01%, respectively.
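The reported metrics (P, R, mAP@.5, and mAP@[.5:.95]) all rest on matching predicted boxes to ground-truth boxes at an intersection-over-union (IoU) threshold. The following is a minimal sketch of that matching step, not the evaluation code used in this work: the corner box format `(x1, y1, x2, y2)`, the helper names `iou` and `precision_recall`, and the greedy one-to-one matching rule are all assumptions made for illustration. It computes precision and recall at a single IoU threshold for one class and does not compute average precision.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],
    truths: List[Box],
    iou_thr: float = 0.5,
) -> Tuple[float, float]:
    """Precision and recall at one IoU threshold (hypothetical helper).

    Predictions are visited in descending confidence order, and each
    ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            v = iou(box, gt)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:  # true positive: overlaps an unmatched ground truth
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


# Toy data: two 10x10 targets, one prediction shifted by a single pixel.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [((0, 0, 10, 10), 0.9), ((21, 21, 31, 31), 0.8), ((50, 50, 60, 60), 0.7)]
print(precision_recall(preds, truths, iou_thr=0.5))
print(precision_recall(preds, truths, iou_thr=0.75))
```

In this toy case the one-pixel shift on a 10x10 box gives an IoU of 81/119 (about 0.68), so the prediction counts as correct at a threshold of 0.5 but not at 0.75. This sensitivity is one reason small targets tend to score lower on mAP@[.5:.95], which averages over IoU thresholds from 0.5 to 0.95, than on mAP@.5.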