Cracks are an early indication of severe structural damage and an important indicator in structural health evaluation and monitoring. However, complex background interference makes the segmentation of small cracks an extremely challenging task. Therefore, a Dual Encoder Crack Segmentation Network (DECS-Net) based on a convolutional neural network (CNN) and a transformer is constructed to achieve automated crack detection. Firstly, a High–Low frequency Attention (HLA) mechanism is proposed that uses the Haar wavelet to obtain the approximation and detail components, which are then processed further to obtain low-frequency and high-frequency features. In addition, a Locally Enhanced Feedforward Network (LEFN) is designed to improve the network's ability to perceive local information. Secondly, a Features Fusion Module (FFM) is proposed to fuse the local features extracted by the CNN encoder with the global contextual features extracted by the transformer encoder, implementing cross-domain fusion and correlation enhancement. Finally, experiments are conducted on two public datasets, DeepCrack and Crack3238, and the recall shows a remarkable improvement, reaching 92.70% on DeepCrack and 79.02% on Crack3238. The comprehensive performance of DECS-Net is ahead of ten state-of-the-art models.
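
To make the Haar wavelet step concrete, the following is a minimal NumPy sketch of a single-level 2D Haar decomposition that splits an image into one approximation (low-frequency) sub-band and three detail (high-frequency) sub-bands. It illustrates only the generic wavelet split; how HLA further processes these components is not specified here and is not reproduced. The function names `haar_dwt2` and `haar_idwt2` and the sub-band names are hypothetical, chosen for illustration, and the orthonormal scaling factor of 1/2 is an assumption.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform (orthonormal scaling, assumed here).

    Returns the approximation sub-band and a tuple of three detail
    sub-bands, each with half the height and width of the input.
    """
    x = np.asarray(x, dtype=np.float64)
    h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0    # local average: low-frequency content
    row_diff = (a + b - c - d) / 2.0  # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0  # left column minus right column
    diag = (a - b - c + d) / 2.0      # diagonal difference
    return approx, (row_diff, col_diff, diag)


def haar_idwt2(approx, details):
    """Invert haar_dwt2 and recover the original array."""
    row_diff, col_diff, diag = details
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (approx + row_diff + col_diff + diag) / 2.0
    x[0::2, 1::2] = (approx + row_diff - col_diff - diag) / 2.0
    x[1::2, 0::2] = (approx - row_diff + col_diff - diag) / 2.0
    x[1::2, 1::2] = (approx - row_diff - col_diff + diag) / 2.0
    return x


if __name__ == "__main__":
    image = np.arange(16, dtype=np.float64).reshape(4, 4)
    approx, details = haar_dwt2(image)
    print("approximation shape:", approx.shape)
    print("reconstruction exact:", np.allclose(haar_idwt2(approx, details), image))
```

Because the transform is orthonormal and invertible, no information is lost in the split: a smooth region yields detail sub-bands near zero, while thin structures such as cracks concentrate energy in the detail sub-bands.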