Abstract

Scene parsing is essential for many high-level AI applications, such as intelligent vehicles and traffic surveillance. In this work, we propose a highly efficient and powerful deep convolutional neural network, namely Efficient Knowledge Enhanced Network (EKENet), for parsing scenes in real-time. Unlike most existing approaches that compromise efficiency for the sake of high accuracy, EKENet achieves an ideal trade-off between the two. Our EKENet is built upon a novel building block, namely Efficient Dual Abstraction (EDA) block, which employs an efficiently parallel convolution structure for extracting spatial features and modeling cross-channel correlations in a dual fashion. Additionally, a novel light-weight Encoding-Enhancing (EE) module is designed to enhance our EKENet, which can efficiently encode high-level knowledge extracted from top layers to guide the learning of low-level features from bottom layers.Extensive experiments on challenging benchmarks, Cityscapes and CamVid datasets, demonstrate that EKENet achieves the new state-of-the-art performance in terms of speed and accuracy tradeoff.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call