As modern semiconductor integrated circuit fabrication processes grow more complex, defects can arise at each process step, causing yield loss and reliability problems. This study introduces DeepSEM-Net, a dual-branch architecture that combines Convolutional Neural Networks (CNNs) with Transformers to classify and segment defects in Scanning Electron Microscope (SEM) images, a task central to semiconductor manufacturing. DeepSEM-Net consists of a classification branch, which combines a CNN with Transformers for global feature extraction, and a segmentation branch, which uses a Pyramid Attention module to recalibrate and exploit the feature map and thereby improve segmentation quality. A semi-supervised defect analysis system built on this model reduces manual inspection effort and can run classification and segmentation either simultaneously or sequentially. Tested on a diverse dataset from a real 12-inch wafer fab, DeepSEM-Net achieved a classification accuracy of 97.25% on a 5-class dataset and a segmentation IoU of 84.40%, and it remained robust when new defect classes were introduced into an unbalanced dataset. The dual-branch strategy thus improves the efficiency and reliability of SEM defect image analysis, helping to address yield loss and product reliability in semiconductor manufacturing.
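For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how classification accuracy and segmentation IoU (intersection over union) are commonly computed. It is not taken from the DeepSEM-Net implementation: the function names, the toy masks, and the toy labels are all hypothetical, and the abstract does not state how IoU is averaged across images or classes.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def accuracy(pred_labels, true_labels):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(p == t for p, t in zip(pred_labels, true_labels))
    return correct / len(true_labels)


# Hypothetical toy data, for illustration only.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
pred_labels = [0, 1, 2, 3, 4, 4]  # five defect classes, indexed 0..4
true_labels = [0, 1, 2, 3, 4, 0]

print(iou(pred_mask, true_mask))           # 2 overlapping pixels / 4 in the union
print(accuracy(pred_labels, true_labels))  # 5 of 6 labels match
```

Under this definition, an IoU of 84.40% means that the predicted and annotated defect regions overlap in 84.40% of the area covered by either of them.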