Abstract

Subject of study. This paper discusses how deep convolutional neural networks can be used to improve images obtained under noisy conditions when supplementary information concerning objects on the image is input in the form of segmentation masks. Several methods of using semantic information during the operation of the network are studied. First, by introducing a mask along with the image at the input of the network, and second, by creating a loss function by using a segmentation mask. Method. Several experimental series are carried out both with and without using various types of semantic information. The experiments were run with several noise intensities. The image-reconstruction quality as a whole is analyzed, along with the reconstruction quality of a region of images corresponding to an entire class. The class of road signs was chosen as the target class since it possesses less variability than many other classes, and this gives an advantage to the network when it is reconstructed in the presence of semantic information compared with in its absence. The COCO dataset with marked segmentation maps was used during the study. Test surroundings for reconstructing the images, with the possibility of visualizing the results of the testing, were developed to analyze the semantic properties of all the objects contained in the COCO set, and this made it possible to draw certain useful conclusions concerning how various properties of the objects influence the accuracy with which they are reconstructed. Main results. We show that, under very noisy conditions, a reconstruction network trained using supplementary information in the form of segmentation masks can do a better job of reconstructing objects that correspond to the masks (by 3.5%), while lowering the capability of the network to reconstruct the entire image is only insignificantly reduced (by 0.4%). However, no such quality increment is obtained for weak and medium noise. Practical significance. Our goal in this paper was not to create a finished algorithm and neural-network architecture, but only to study the possible properties of such algorithms; we therefore supplied semantic reference markers to the input of the neural networks. In developing such a method, a segmenter network that would automatically extract information from a noisy image can be added (the process itself can be iterative in this case—the image is improved after being segmented, and a refined segmentation mask is constructed from the improved image).

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call