This paper proposes a surface defect detection method for hot-rolled steel strips that addresses three challenges: small target defects, large variation in defect morphology, and unclear defect characteristics. The method is based on multiscale feature perception and adaptive feature fusion. First, guided by the spatial distribution characteristics of the steel strip image, redundant background interference is removed using automatic gamma correction and Otsu thresholding. Second, guided by the characteristics of steel strip surface defects, the paper proposes TDB-YOLO, a YOLO-based model that combines a small target detection layer, a Double Cross Stage Partial (CSP) Bottleneck with three convolutions (DC3), and a Bidirectional Feature Pyramid Network (BiFPN). For small defects, the small target detection layer uses a smaller receptive field to focus on fine-grained features, reducing the probability of missed detections. For feature extraction, DC3 strengthens the interaction of feature information across spatial scales, enabling the model to handle features of varying scales. For feature fusion, the BiFPN adaptively fuses deep and shallow feature information, enriching the semantic content of the fused features. The proposed model achieved an accuracy of 90.3%, a recall of 88.0%, and a mean average precision of 90.4% on steel strip surface defects, running at 33 frames per second. It outperformed the other detection models it was compared with, indicating that it can meet the real-time requirements of steel strip surface defect detection in industrial scenarios.
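
The background-removal step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the gamma rule (choosing gamma so that the mean intensity moves toward mid-gray), the assumption that the strip is brighter than the background, the column-wise crop, and the function names `auto_gamma`, `otsu_threshold`, and `crop_to_strip` are all hypothetical choices made for illustration.

```python
import numpy as np


def auto_gamma(img: np.ndarray) -> np.ndarray:
    """Gamma-correct a uint8 grayscale image.

    Assumption: gamma = log(0.5) / log(mean), a common heuristic that
    pushes the mean intensity toward mid-gray. The paper's rule may differ.
    """
    norm = img.astype(np.float64) / 255.0
    mean = float(np.clip(norm.mean(), 1e-6, 1.0 - 1e-6))
    gamma = np.log(0.5) / np.log(mean)
    return np.rint(255.0 * norm ** gamma).astype(np.uint8)


def otsu_threshold(img: np.ndarray) -> int:
    """Return the Otsu threshold t; pixels with value > t form the bright class."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class {0..t}
    mu = np.cumsum(prob * np.arange(256))    # cumulative first moment
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom   # between-class variance
    # Thresholds that leave one class empty carry no information.
    sigma_b[~np.isfinite(sigma_b) | (denom <= 0)] = 0.0
    return int(np.argmax(sigma_b))


def crop_to_strip(img: np.ndarray) -> np.ndarray:
    """Crop away background columns (assumes a bright strip spanning the image height)."""
    corrected = auto_gamma(img)
    mask = corrected > otsu_threshold(corrected)
    # Keep columns in which more than half of the pixels are foreground.
    cols = np.flatnonzero(mask.mean(axis=0) > 0.5)
    if cols.size == 0:
        return img  # no strip found; leave the image unchanged
    return img[:, cols[0]:cols[-1] + 1]


if __name__ == "__main__":
    # Synthetic example: dark background with a brighter vertical strip.
    demo = np.full((64, 100), 10, dtype=np.uint8)
    demo[:, 20:80] = 120
    print(crop_to_strip(demo).shape)
```

Cropping by columns is only one way to use the threshold mask. A pipeline could equally crop to a bounding box or mask the background pixels in place, depending on how the strip sits in the camera frame.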