Abstract

Instance segmentation is an important yet challenging task in computer vision. Existing mainstream single-stage solutions with parameterized mask representations design neck modules to fuse features from different layers; however, segmentation performance is still limited by the layer-by-layer transmission scheme. In this article, an instance segmentation framework with an adaptive long-neck (ALN) network and an atrous-residual structure is proposed. The long-neck network is composed of two cascaded bidirectional fusion units that facilitate information exchange among features of different layers along top-down and bottom-up pathways. In particular, a new cross-layer transmission scheme is introduced in the top-down pathway to achieve a hybrid dense fusion of multiscale features, and the weights of the different features are learned adaptively according to their respective contributions, which promotes network convergence. Meanwhile, the bottom-up pathway further complements the features with additional location cues. In this way, high-level semantic information and low-level location information are tightly integrated. Furthermore, an atrous-residual structure is added to the mask prototype branch of instance prediction to capture more contextual information, which contributes to the generation of high-quality masks. The experimental results indicate that the proposed method achieves effective segmentation and that the output masks closely match object contours.
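The adaptive weighting described above can be illustrated with a minimal sketch. The normalization scheme below follows the common "fast normalized fusion" idea (non-negative weights rescaled to sum to one); the paper's exact parameterization, layer choices, and the function name `adaptive_fuse` are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def adaptive_fuse(features, weights, eps=1e-4):
    """Fuse same-shaped multiscale feature maps with adaptive non-negative weights.

    Assumed analogue of the ALN top-down fusion step: each input feature's
    contribution is a learned scalar, clipped to be non-negative and then
    normalized so the contributions sum to (approximately) one.
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # keep weights >= 0
    w = w / (w.sum() + eps)                                # normalize contributions
    return sum(wi * f for wi, f in zip(w, features))

# Toy usage: three already-resized feature maps from different pyramid levels.
feats = [np.full((4, 4), v) for v in (1.0, 2.0, 3.0)]
fused = adaptive_fuse(feats, weights=[0.5, 0.3, 0.2])
```

In training, the weights would be learnable parameters updated by backpropagation, so each layer's contribution is tuned to how much it helps the final masks.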
