A Generalized Earley Parser for Human Activity Parsing and Prediction

Siyuan Qi, Ping Wei, Baoxiong Jia, Siyuan Huang, Song-Chun Zhu

doi:10.1109/tpami.2020.2976971

Abstract

Detection, parsing, and future prediction on sequence data (e.g., videos) require algorithms that capture the non-Markovian and compositional properties of high-level semantics. Context-free grammars are a natural choice for capturing such properties, but traditional grammar parsers (e.g., the Earley parser) take only symbolic sentences as input. In this paper, we generalize the Earley parser to parse sequence data that is neither segmented nor labeled. Given the output of an arbitrary probabilistic classifier, the generalized Earley parser finds the optimal segmentation and labels in the language defined by the input grammar. Based on the parsing results, it makes top-down future predictions. The proposed method is generic, principled, and widely applicable. Experimental results on three video datasets show the benefit of our method for both human activity parsing and prediction.
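
To make the problem statement concrete, the following is a minimal sketch of the task the abstract describes: given per-frame classifier probabilities, find the segmentation and labeling that is most probable among the label sequences allowed by a language. It is not the generalized Earley parser itself. As a simplifying assumption, the language is given as an explicit finite list of sentences rather than a context-free grammar, and each sentence is aligned to the frames with a simple dynamic program. The names `best_segmentation`, `parse`, `probs`, and `language`, and all numbers, are hypothetical and chosen for illustration only.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    # Safe logarithm: a zero probability maps to -inf instead of raising.
    return math.log(p) if p > 0 else NEG_INF


def best_segmentation(probs, sentence):
    """Align `sentence` (a tuple of labels) to all frames in `probs`.

    Each label must cover one or more consecutive frames, in order.
    `probs` is a list of dicts mapping label -> probability for each frame.
    Returns (log_prob, per_frame_labels), or (-inf, None) if no alignment exists.
    """
    T, K = len(probs), len(sentence)
    # score[t][k]: best log-prob of frames 0..t with frame t assigned sentence[k]
    score = [[NEG_INF] * K for _ in range(T)]
    back = [[0] * K for _ in range(T)]
    score[0][0] = _log(probs[0][sentence[0]])
    for t in range(1, T):
        for k in range(min(K, t + 1)):
            stay = score[t - 1][k]                          # same segment continues
            move = score[t - 1][k - 1] if k > 0 else NEG_INF  # a new segment starts
            if stay >= move:
                best, back[t][k] = stay, k
            else:
                best, back[t][k] = move, k - 1
            if best > NEG_INF:
                score[t][k] = best + _log(probs[t][sentence[k]])
    if score[T - 1][K - 1] == NEG_INF:
        return NEG_INF, None
    # Backtrack from the last frame, which must carry the last label.
    labels, k = [], K - 1
    for t in range(T - 1, -1, -1):
        labels.append(sentence[k])
        k = back[t][k]
    labels.reverse()
    return score[T - 1][K - 1], labels


def parse(probs, language):
    """Brute-force search over a finite language for the best sentence."""
    best = (NEG_INF, None, None)
    for sentence in language:
        log_prob, labels = best_segmentation(probs, sentence)
        if log_prob > best[0]:
            best = (log_prob, sentence, labels)
    return best


# Hypothetical classifier output for four frames and two labels.
probs = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.4, "b": 0.6},
    {"a": 0.7, "b": 0.3},
    {"a": 0.2, "b": 0.8},
]
# Hypothetical finite language: either "a then b" or "b then a".
language = [("a", "b"), ("b", "a")]

log_prob, sentence, labels = parse(probs, language)
```

In this toy input, taking the most probable label at each frame independently would give a, b, a, b, which is not a sentence of the language. Constraining the search to the language instead yields a single "a" segment followed by a single "b" segment.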

Full Text