Highlights on metal surfaces can severely disrupt image continuity, produce false edges, and weaken or even erase texture details in the highlight regions, which interferes with subsequent operations such as surface region segmentation and defect detection. To address the low efficiency, high loss, susceptibility to distortion, and difficult calibration associated with metal highlight data, an unsupervised perceptual enhancement network model based on a convolutional neural network (CNN) is proposed. First, a generative adversarial approach is used to generate a large number of metal images containing highlight features, which enlarges the set of highlight metal images available for training. Second, a detail enhancement model (DEM) and a color enhancement model (CEM) are introduced into the context aggregation network to improve the retention of feature detail in the low-resolution weight maps. Finally, the multi-scale structural similarity function replaces the original structural similarity function to overcome its insensitivity to detail when the image is large. Experiments show that, compared with other multi-exposure image fusion models, the proposed model improves the mutual information and average gradient of the fused images by about 10% and retains more texture feature information.
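
For readers unfamiliar with the two evaluation indices named above, the following is a minimal sketch of how average gradient and mutual information are commonly computed for grayscale images. It is not the evaluation code used in this work: the function names, the 16-bin histogram, and the representation of images as nested lists of intensities in [0, 256) are all assumptions made for illustration.

```python
import math
from collections import Counter

def average_gradient(img):
    """Mean local intensity change of a 2-D image (list of rows).

    A higher value usually indicates sharper texture detail.
    """
    rows, cols = len(img), len(img[0])
    total = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dx = img[i][j + 1] - img[i][j]  # horizontal difference
            dy = img[i + 1][j] - img[i][j]  # vertical difference
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((rows - 1) * (cols - 1))

def mutual_information(a, b, levels=256, bins=16):
    """Mutual information (in bits) between two equally sized images.

    Intensities are assumed to lie in [0, levels) and are quantized
    into `bins` histogram bins (an illustrative choice).
    """
    width = levels / bins
    pairs = [(int(x // width), int(y // width))
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    n = len(pairs)
    joint = Counter(pairs)                 # joint histogram
    hist_a = Counter(p[0] for p in pairs)  # marginal histogram of a
    hist_b = Counter(p[1] for p in pairs)  # marginal histogram of b
    mi = 0.0
    for (u, v), count in joint.items():
        # p(u,v) * log2( p(u,v) / (p(u) * p(v)) )
        mi += (count / n) * math.log2(count * n / (hist_a[u] * hist_b[v]))
    return mi
```

In fusion evaluation, the mutual information score is often reported as the sum of `mutual_information(source, fused)` over all source exposures; whether the present work uses that convention is not stated here.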