Object detection is a crucial component of the perception system in autonomous driving. However, road scenes are highly complex environments in which the visibility and characteristics of traffic targets are prone to attenuation and loss under varying conditions such as lighting, weather, time of day, background clutter, and traffic density. Existing object detection networks therefore lack sufficient learning capability for such targets; this complexity also exacerbates the loss of features during feature extraction and fusion, significantly degrading the network's detection performance on traffic targets. To address these issues, this paper presents a novel network, HRYNet. First, a dual fusion gradual pyramid structure (DFGPN) is introduced, which employs a two-stage gradient fusion strategy to generate more comprehensive multi-scale high-level semantic information, strengthen the connections between non-adjacent feature layers, and reduce the information gap between them. Second, HRYNet introduces an anti-interference feature extraction module, the residual multi-head self-attention mechanism (RMA), which enhances target information through a feature-channel weighting strategy, thereby suppressing background interference and improving the network's attention capability. Finally, the detection performance of HRYNet was evaluated on three datasets: the horizontally collected dataset BDD100K, the UAV high-altitude dataset VisDrone, and a custom dataset. Experimental results demonstrate that HRYNet achieves a higher mAP_0.5 than YOLOv8s on all three datasets, with gains of 10.8%, 16.7%, and 5.5%, respectively. To adapt HRYNet to mobile devices, this study further presents Lightweight HRYNet (LHRYNet), which reduces the number of model parameters by 2 million. The results demonstrate that LHRYNet also outperforms YOLOv8s in terms of mAP_0.5, with improvements of 6.7%, 10.9%, and 2.5% on the three datasets, respectively.
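To make the RMA idea concrete, below is a minimal PyTorch sketch of a residual multi-head self-attention block combined with a feature-channel weighting gate. This is an illustrative reconstruction under stated assumptions, not the paper's exact implementation: the class name `RMASketch`, the tensor shapes, and the use of a squeeze-and-excitation-style gate for the channel weighting are all assumptions introduced here for clarity.

```python
# Illustrative sketch of an RMA-style block: feature-channel weighting
# followed by residual multi-head self-attention. Class name, shapes,
# and the SE-style gate are assumptions, not the paper's exact design.
import torch
import torch.nn as nn


class RMASketch(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4, reduction: int = 16):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)
        # Channel-weighting branch (assumed squeeze-and-excitation style):
        # global pooling -> bottleneck MLP -> per-channel sigmoid gate.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from the backbone.
        b, c, h, w = x.shape
        # Re-weight channels to emphasize target features over background.
        x = x * self.gate(x)
        # Flatten spatial positions into a token sequence for attention.
        tokens = x.flatten(2).transpose(1, 2)           # (B, H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        tokens = self.norm(tokens + attn_out)           # residual connection
        return tokens.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    block = RMASketch(channels=64)
    y = block(torch.randn(2, 64, 20, 20))
    print(y.shape)  # torch.Size([2, 64, 20, 20])
```

In this sketch, the channel gate plays the role of the abstract's "feature-channel weighting strategy" (suppressing background channels before attention), while the residual connection around the self-attention layer preserves the original features, matching the "residual multi-head self-attention" naming; the relative ordering of the two operations is a design assumption.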