This study aims to build an operation and maintenance system for photovoltaic (PV) plants, in which an unmanned aerial vehicle (UAV) carrying a thermal imager captures images of the plant. In the proposed system, infrared (IR) images were used to detect thermal defects in PV modules, and RGB images were used to detect module surface defects. The two image types were used together to cross-validate the causes of module defects. In Part I, the PV plant information pattern was created, taking a PV plant in Taiwan (1,482 PV modules, 410 kW) as an example. Feature points in the PV system images were detected using the Scale-Invariant Feature Transform (SIFT) to cope with feature variations such as changes in image luminance, rotation, and scale. Identical feature points were then matched across multiple local images of the plant. Afterwards, the optimal number of matched feature points was determined using homography transformation and random sample consensus (RANSAC), and the local images were stitched into a PV plant panorama. Background noise in the panorama was removed using image hue. The PV modules were segmented using image luminance, and their geometry was reconstructed using morphological operations. The edge contour of each PV module was extracted with the Laplace operator to obtain its perimeter, area, and centroid features. The quantity and positions of the PV modules were then recognized and calculated to form the PV plant information pattern. In Part II, the PV module defect recognition and classification system was built. Defect images were collected from seven PV plants in Taiwan, and the image features were enhanced using a convolutional neural network (CNN). In addition to convolution layers that capture image features, max pooling and local response normalization were used to enhance the features. A color space transform was used to intensify the color features, improve classification accuracy, and recognize and locate the PV module defects. 
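As a rough illustration of the Part I contour-feature step (edge extraction with the Laplace operator, followed by perimeter, area, and centroid), a minimal pure-Python sketch is given here. The binary mask, the 4-neighbour kernel, the pixel-count perimeter, and the function names `laplacian` and `module_features` are assumptions made for illustration and are not taken from the study's implementation.

```python
# Illustrative sketch only; names, kernel, and test mask are assumptions.

def laplacian(mask):
    """Apply the 4-neighbour Laplace kernel to a binary mask (zero padding)."""
    h, w = len(mask), len(mask[0])

    def at(r, c):
        # Pixels outside the image are treated as background (0).
        return mask[r][c] if 0 <= r < h and 0 <= c < w else 0

    return [
        [at(r - 1, c) + at(r + 1, c) + at(r, c - 1) + at(r, c + 1) - 4 * at(r, c)
         for c in range(w)]
        for r in range(h)
    ]


def module_features(mask):
    """Return (perimeter, area, centroid) of the foreground region in mask."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no foreground pixels")
    lap = laplacian(mask)
    # Edge pixels: foreground pixels with a non-zero Laplacian response.
    edge = [(r, c) for r, c in pixels if lap[r][c] != 0]
    area = len(pixels)
    perimeter = len(edge)  # count of contour pixels, a coarse estimate
    centroid = (sum(r for r, _ in pixels) / area,
                sum(c for _, c in pixels) / area)
    return perimeter, area, centroid


# Hypothetical module: a 3x4 rectangle at rows 1-3, columns 1-4 of a 5x6 mask.
mask = [[1 if 1 <= r <= 3 and 1 <= c <= 4 else 0 for c in range(6)]
        for r in range(5)]
perimeter, area, centroid = module_features(mask)
```

In a real pipeline the mask would come from the luminance segmentation and morphological reconstruction described above, and an image-processing library would normally replace these hand-written loops.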
The hot spot recognition accuracy for IR images was 100%. The classification accuracy for eight module classes, comprising one normal module and seven defect modules, was 97.52%. The classification accuracy for six module classes in RGB images, comprising the appearance of one normal module and five defects, was 99.17%. The classification accuracy for 14 defects across IR thermal images and RGB images was 97.52%. The causes of defects were cross-validated using the IR thermal images and the RGB images. K-fold cross-validation was applied to select the optimal model. The recognition time per image was less than 0.3 s, which is shorter than the camera time constant, so the results show that the system is applicable to real-time detection. In Part III, the PV plant defect information pattern was created. Defective PV modules were labeled during detection, so that the defects in the plant's PV modules and their positions were obtained, which facilitates PV plant maintenance.
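The K-fold model selection mentioned above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the helper names (`k_fold_indices`, `cross_val_score`, `select_model`), the contiguous unshuffled folds, and the constant toy scores are all hypothetical and do not reproduce the study's models or data.

```python
# Illustrative sketch only; names and toy scores are assumptions.

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_score(train_and_score, n_samples, k):
    """Average the validation score over the k train/validation splits."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, val_idx in enumerate(folds):
        # Train on every fold except the i-th, validate on the i-th.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(train_idx, val_idx))
    return sum(scores) / k


def select_model(candidates, n_samples, k):
    """Return the name of the candidate with the highest mean score."""
    return max(candidates,
               key=lambda name: cross_val_score(candidates[name], n_samples, k))


# Toy candidates: each callable stands in for "train on train_idx, then
# return accuracy on val_idx"; the constant scores are made up.
candidates = {
    "model_a": lambda train_idx, val_idx: 0.90,
    "model_b": lambda train_idx, val_idx: 0.95,
}
folds = k_fold_indices(10, 3)
best = select_model(candidates, n_samples=10, k=3)
```

In practice the sample indices would usually be shuffled (and often stratified by class) before splitting, and each candidate would train an actual classifier rather than return a fixed score.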