In recent years, driven by advances in the photovoltaic industry, solar power generation has emerged as a crucial energy source in China and around the globe. To improve the precision of automated detection of minor defects in photovoltaic modules, a progressive annotation approach is first employed to locate and label defect samples. Computer vision techniques are then used to accurately segment photovoltaic modules and defect samples from complex backgrounds. Finally, a model trained with transfer learning is deployed to classify and identify the defects. The results indicate that the Mask Region-based Convolutional Neural Network (Mask R-CNN) model achieves an accuracy of 98.7% and a recall of 0.913. Its detection speed is 280.69 frames per second, with an inference time of 3.53 ms. Overall, the defect detection and classification algorithm based on computer vision techniques significantly improves the precision of automated detection of minor defects in complex environments, which has practical significance for ensuring the quality and operational reliability of photovoltaic modules.
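
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, recall, and throughput are conventionally computed. It uses the standard textbook definitions, not the evaluation code behind the figures above. The function names and the confusion-matrix counts in the usage lines are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: standard metric definitions, not the authors' evaluation code.
# All function names and example counts below are hypothetical.

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (tp + tn) / total


def recall(tp: int, fn: int) -> float:
    """Fraction of actual defects that the model finds."""
    positives = tp + fn
    if positives == 0:
        raise ValueError("at least one positive sample is required")
    return tp / positives


def frames_per_second(latency_ms: float) -> float:
    """Throughput implied by a per-frame latency, assuming no other overhead."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    print(f"accuracy = {accuracy(tp=90, tn=8, fp=1, fn=1):.3f}")
    print(f"recall   = {recall(tp=90, fn=10):.3f}")
    # A 3.53 ms per-frame latency implies roughly 283 frames per second,
    # which is in the same range as the reported 280.69.
    print(f"fps      = {frames_per_second(3.53):.2f}")
```

Accuracy and recall measure different things: on a dataset where defects are rare, accuracy can be high while recall is noticeably lower, which is one reason both values are usually reported together.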