Deep Semantic Parsing of Freehand Sketches With Homogeneous Transformation, Soft-Weighted Loss, and Staged Learning

Ying Zheng,Hongxun Yao,Xiaoshuai Sun

doi:10.1109/tmm.2020.3028466

Abstract

In this paper, we propose a novel deep framework for part-level semantic parsing of freehand sketches, which makes three main contributions that are experimentally shown to have substantial practical merit. First, we propose a homogeneous transformation method to address the problem of domain adaptation. For the task of sketch parsing, there is no available data of labeled freehand sketches that can be directly used for model training. An alternative solution is to learn from datasets of real image parsing, while the domain adaptation is an inevitable problem. Unlike existing methods that utilize the edge maps of real images to approximate freehand sketches, the proposed homogeneous transformation method transforms the data from domains of real images and freehand sketches into a homogeneous space to minimize the semantic gap. Second, we design a soft-weighted loss function as guidance for the training process, which gives attention to both the ambiguous label boundary and class imbalance. Third, we present a staged learning strategy to improve the parsing performance of the trained model, which takes advantage of the shared information and specific characteristic from different sketch categories. Extensive experimental results demonstrate the effectiveness of the above three methods. Specifically, to evaluate the generalization ability of our homogeneous transformation method, additional experiments for the task of sketch-based image retrieval are conducted on the QMUL FG-SBIR dataset. Finally, by integrating the proposed three methods into a unified framework of deep semantic sketch parsing (DeepSSP), we achieve the state-of-the-art on the public SketchParse dataset.

Full Text