Abstract

Recent deep models advance the task of semantic visual parsing by increasing the depth of networks and the resolution (size) of the predicted labelmaps. However, the contextual information within each layer and between layers is not fully explored. Long Short Term Memory Networks(LSTM) that learn to propagate information is well-suited to model pixels dependencies with respect to spacial locations within layers and depths across layers. Unlike previous LSTM-based methods that tend to enhance representation of each pixel only by involving the information from adjacent area. This work proposes Progressively Diffused Networks (PDNs) to deal with complex semantic parsing tasks. It can explore spatial dependencies in a larger field that represents the rich contextual information among pixels. The proposed model has three appealing properties. First, it enables information to be progressively broadcast across feature maps by stacking multiple diffusion layers. Second, in each layer, multiple convolutional LSTMs are adopted to generate a series of feature maps with different ranges of contexts. Third, in each LSTM unit, a special type of atrous filters are designed to capture the short range and long range dependencies from various neighbors. Extensive experiments demonstrate the effectiveness of PDNs to substantially improve the performances of existing LSTM-based models.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.