Abstract
Object-class segmentation is a computer vision task that requires labeling each pixel of an image with the class of the object it belongs to. Deep convolutional neural networks (DNNs) are able to learn and exploit the local spatial correlations required for this task. They are, however, restricted by their small, fixed-size filters, which limit their ability to learn long-range dependencies. Recurrent neural networks (RNNs), on the other hand, do not suffer from this restriction. Their iterative interpretation allows them to model long-range dependencies by propagating activity. This property might be especially useful when labeling video sequences, where both spatial and temporal long-range dependencies occur. In this work, we propose novel RNN architectures for object-class segmentation. We investigate three ways to consider past and future context in the prediction process by comparing networks that process the frames one by one with networks that have access to the whole sequence. We evaluate our models on the challenging NYU Depth v2 dataset for object-class segmentation and obtain competitive results.
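The claim that iterating a small filter lets activity reach distant locations can be illustrated with a toy example. The sketch below is not the architecture proposed in this work: it is a one-dimensional, purely linear recurrence, and the function names, the averaging kernel, and the signal length are all assumptions chosen for illustration. It applies the same width-3 filter repeatedly to a single active unit and records how far the activity has spread after each step.

```python
def recurrent_step(state, kernel):
    """Apply one 1-D convolution with zero padding (output has the same length)."""
    radius = len(kernel) // 2
    n = len(state)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                acc += w * state[j]
        out.append(acc)
    return out


def support_width(state):
    """Distance (in units) spanned from the first to the last nonzero activation."""
    nonzero = [i for i, v in enumerate(state) if v != 0.0]
    return nonzero[-1] - nonzero[0] + 1 if nonzero else 0


def receptive_widths(steps, size=21, kernel=(1 / 3, 1 / 3, 1 / 3)):
    """Start from one active unit and reuse the same small filter at every step."""
    state = [0.0] * size
    state[size // 2] = 1.0  # single impulse in the middle of the signal
    widths = []
    for _ in range(steps):
        state = recurrent_step(state, kernel)
        widths.append(support_width(state))
    return widths


print(receptive_widths(4))  # [3, 5, 7, 9]
```

A single application of the filter reaches only its own width, whereas each further iteration widens the affected region by two units, so the reach grows with the number of steps rather than being fixed by the filter size.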