Abstract

Scene parsing has attracted significant attention for its practical and theoretical value in computer vision. A typical scene parsing algorithm densely labels the pixels or 3D points of a scene. Traditionally, this procedure relies on a pre-trained classifier to predict label information, followed by a smoothing step via a Markov Random Field to enhance consistency. LabelTransfer is a category of scene parsing algorithms that augments this traditional framework by finding dense correspondences and transferring labels across scenes. In this paper, we present a novel scene parsing algorithm that matches maximally similar structures between scenes via efficient low-rank graph matching. The inputs are images and, if available, well-aligned point clouds; the two modalities are processed in separate pipelines. The image pipeline learns a reliable classifier and matches local structures via graph matching. The point-cloud pipeline performs preliminary segmentation and generates feasible label sets. The two pipelines are merged at the inference step, for which we design effective and efficient potential functions. We propose a new graph matching model incorporating low-rank and Frobenius regularization, which not only guarantees an accurate solution but also achieves high optimization efficiency via an eigen-decomposition strategy. Several challenging experiments show competitive performance of the proposed method against state-of-the-art LabelTransfer algorithms; with point clouds, performance is further significantly enhanced.
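To give a flavor of how an eigen-decomposition strategy can drive graph matching, the following is a minimal illustrative sketch of classic spectral matching: candidate correspondences are scored by the leading eigenvector of a pairwise-affinity matrix. This is a generic, hypothetical example for intuition only, not the paper's low-rank, Frobenius-regularized model; the affinity matrix `M` and all values are invented for illustration.

```python
import numpy as np

def spectral_match(M):
    """Soft graph matching via the leading eigenvector of a symmetric
    pairwise-affinity matrix M over candidate correspondences.
    (Illustrative spectral-matching sketch, not the paper's model.)"""
    vals, vecs = np.linalg.eigh(M)   # eigh: symmetric eigen-decomposition
    v = np.abs(vecs[:, -1])          # leading eigenvector, made non-negative
    return v / v.sum()               # normalized soft assignment scores

# Toy affinity over 4 candidate correspondences: the first two
# strongly agree with each other, so they receive the highest scores.
M = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.1, 0.0],
              [0.1, 0.1, 1.0, 0.2],
              [0.0, 0.0, 0.2, 1.0]])
scores = spectral_match(M)
best = int(np.argmax(scores))
```

In practice the eigen-decomposition is what makes such formulations efficient: the dominant structure of the (often large) affinity matrix is captured by a few leading eigenvectors rather than by exhaustive combinatorial search.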
