With the development of intelligent transportation systems, pedestrians can be accurately detected in most normal road scenes. However, detection accuracy usually drops sharply when pedestrians blend into backgrounds with very similar colors or textures. In this paper, a camouflaged object detection method is proposed to detect pedestrians or vehicles against such highly similar backgrounds. Specifically, we design a guide-learning-based multi-scale detection network (GLNet) to distinguish the weak semantic differences between a pedestrian and its similar background and to output an accurate segmentation map to the autonomous driving system. The proposed GLNet mainly consists of a backbone network for basic feature extraction, a guide-learning module (GLM) that generates the principal prediction map, and a multi-scale feature enhancement module (MFEM) that refines the prediction map. Based on guide learning and a coarse-to-fine strategy, GLNet produces a final prediction map that precisely describes the position and contour of the pedestrians or vehicles. Extensive experiments on four benchmark datasets, namely CHAMELEON, CAMO, COD10K, and NC4K, demonstrate the superiority of the proposed GLNet over several existing state-of-the-art methods.
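The coarse-to-fine refinement described above can be illustrated with a minimal sketch: a coarse, low-resolution prediction map (the GLM output in the paper's terminology) is progressively upsampled and blended with same-scale feature evidence at each finer level. This is not the authors' implementation; the `refine` blending rule, the nearest-neighbor upsampling, and all names here are hypothetical simplifications of the MFEM's role, using numpy in place of a deep-learning framework.

```python
import numpy as np

def upsample2x(pred):
    # Nearest-neighbor 2x upsampling by repeating rows and columns.
    return pred.repeat(2, axis=0).repeat(2, axis=1)

def refine(pred, feat, alpha=0.5):
    # Hypothetical refinement: blend the upsampled coarse prediction with
    # same-resolution feature evidence, keeping values in [0, 1].
    return np.clip((1.0 - alpha) * pred + alpha * feat, 0.0, 1.0)

def coarse_to_fine(coarse, feats):
    # coarse: low-resolution principal prediction map (GLM output, conceptually).
    # feats: feature maps ordered from coarser to finer scales, each twice the
    #        resolution of the previous prediction (MFEM role, conceptually).
    pred = coarse
    for feat in feats:
        pred = refine(upsample2x(pred), feat)
    return pred

# Toy example: a 4x4 coarse map refined through 8x8 and 16x16 feature maps.
rng = np.random.default_rng(0)
coarse = np.zeros((4, 4))
coarse[1:3, 1:3] = 1.0  # coarse localization of the camouflaged object
feats = [rng.random((8, 8)), rng.random((16, 16))]
final = coarse_to_fine(coarse, feats)
print(final.shape)  # (16, 16)
```

Each pass doubles the spatial resolution, so the final map recovers finer contour detail than the coarse localization alone, which mirrors the intent of the coarse-to-fine strategy.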