Highway infrastructure comprises the facilities and equipment along a road, such as bridges, culverts, traffic signs, and guardrails. New technologies such as artificial intelligence, big data, and the Internet of Things are driving the digital transformation of highway infrastructure toward intelligent roads. Unmanned Aerial Vehicles (UAVs, or drones) are a promising application of intelligent technology in this field: they enable fast and precise detection, classification, and localization of infrastructure along highways, which can significantly improve efficiency and ease the burden on road management staff. Two factors make this task difficult. First, because roadside infrastructure is exposed to the outdoors for long periods, it is easily damaged or obscured by objects such as sand and rocks. Second, UAV images have high resolution, variable shooting angles, complex backgrounds, and a high proportion of small targets, so existing object detection models applied directly cannot meet the requirements of practical industrial applications. In addition, large and comprehensive UAV image datasets of infrastructure along highways are lacking. To address these problems, we propose a multi-class infrastructure detection model that combines multi-scale feature fusion with an attention mechanism. The backbone network of the CenterNet model is replaced with ResNet50, and an improved feature fusion module enables the model to generate fine-grained features that improve the detection of small targets; an attention mechanism is also added so that the network assigns higher attention weights to, and thus focuses more on, valuable regions. Because no publicly available dataset of infrastructure along highways captured by UAVs exists, we filter and manually annotate a laboratory-captured highway dataset to build a highway infrastructure dataset.
Experimental results show that the proposed model achieves a mean Average Precision (mAP) of 86.7%, which is 3.1 percentage points higher than the baseline model, and that it performs significantly better overall than the other detection models compared.
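
To make the reported figures concrete, the following minimal Python sketch shows how mAP is obtained as the mean of per-class Average Precision values, and what a gain of 3.1 percentage points implies for the baseline. The values 86.7 and 3.1 come from the results above; the function name `mean_average_precision` and the per-class AP values are hypothetical and serve only as illustration, since per-class results are not given here.

```python
def mean_average_precision(per_class_ap):
    """Return the mean of per-class Average Precision values (each in [0, 1])."""
    if not per_class_ap:
        raise ValueError("need at least one class")
    return sum(per_class_ap.values()) / len(per_class_ap)


# Figures stated in the text.
improved_map = 86.7  # mAP of the proposed model, in percent
gain_points = 3.1    # improvement over the baseline, in percentage points

# Implied baseline mAP, and the same gain expressed as a relative improvement.
baseline_map = round(improved_map - gain_points, 1)         # 83.6 percent
relative_gain = round(100 * gain_points / baseline_map, 2)  # about 3.71 percent

# Hypothetical per-class AP values, for illustration only (not from the paper).
example_ap = {
    "bridge": 0.90,
    "culvert": 0.80,
    "traffic_sign": 0.85,
    "guardrail": 0.85,
}
example_map = round(mean_average_precision(example_ap), 3)  # 0.85

print(f"implied baseline mAP: {baseline_map}%")
print(f"relative improvement: {relative_gain}%")
print(f"example mAP over {len(example_ap)} classes: {example_map}")
```

Under these assumptions, the implied baseline mAP is 83.6%, so the 3.1-point absolute gain corresponds to a relative improvement of roughly 3.7%. The sketch also illustrates why the gain is stated in percentage points rather than percent.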