Abstract
Unsupervised segmentation is an essential pre-processing technique in many computer vision tasks. However, current unsupervised segmentation techniques are either sensitive to parameters such as the number of segments or incur high training and inference complexity. Encouraged by neural networks' flexibility and their ability to model intricate patterns, an unsupervised segmentation framework based on a novel deep image clustering (DIC) model is proposed. The DIC consists of a feature transformation subnetwork (FTS) and a trainable deep clustering subnetwork (DCS) for unsupervised image clustering. FTS is built on a simple and capable network architecture. DCS can assign pixels to a variable number of clusters by iteratively updating cluster associations and cluster centers. Moreover, a superpixel guided iterative refinement loss is designed to optimize the DIC parameters in an overfitting manner. Extensive experiments have been conducted on the Berkeley Segmentation Database. The experimental results show that DCS is more effective in aggregating features during the clustering procedure. DIC is also shown to be less sensitive to varying segmentation parameters and to have lower computation cost, and it achieves significantly better segmentation performance than state-of-the-art techniques. The source code is available at https://github.com/zmbhou/DIC.
Highlights
Object segmentation is a challenging problem in the field of computer vision and it has been widely applied in areas such as object recognition and image classification
We propose a deep image clustering (DIC) model which consists of a feature transformation subnetwork (FTS) and a differentiable deep clustering subnetwork (DCS) for dividing the image space into different clusters
9: Generate the associations H^q based on Y^t and the cluster centers from iteration q−1 according to Eq. (2).
10: Update the cluster centers for iteration q according to Eq. (3).
11: end for
12: Generate Υ^t based on H and the cluster centers according to Eq. (4).
13: Generate the cluster assignments for iteration t according to Eq. (5).
14: Generate the refined cluster assignments for iteration t according to Eq. (7).
15: Take the refined cluster assignments as the supervision and optimize the network parameters using the iterative refinement loss defined in Eq. (8).
16: end for
17: Take the cluster assignments from the final iteration T as the final cluster assignments for image I.
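The inner loop in steps 9 to 11 alternates between computing pixel-to-cluster associations and updating cluster centers, and step 14 refines the resulting assignments. Eqs. (2), (3) and (7) are not reproduced in this excerpt, so the following is only a minimal sketch under stated assumptions: associations are taken to be a softmax over negative squared distances, centers are association-weighted feature means (a soft k-means style update), and refinement is a majority vote inside each superpixel. The function names `soft_cluster` and `refine_with_superpixels` are hypothetical, and the paper's DCS may differ in all of these choices.

```python
import math
from collections import Counter


def soft_cluster(features, centers, n_iters=10):
    """Alternate soft association and center updates (soft k-means style).

    ASSUMPTION: this stands in for Eqs. (2) and (3), which are not shown
    in the excerpt. `features` is a list of per-pixel feature vectors,
    `centers` a list of initial cluster centers; n_iters should be >= 1.
    """
    assoc = []
    for _ in range(n_iters):
        # Association step: softmax over negative squared distances.
        assoc = []
        for f in features:
            d = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in centers]
            m = min(d)  # shift by the minimum for numerical stability
            w = [math.exp(-(x - m)) for x in d]
            s = sum(w)
            assoc.append([x / s for x in w])
        # Center step: association-weighted mean of the features.
        new_centers = []
        for k, c in enumerate(centers):
            mass = sum(row[k] for row in assoc)
            if mass < 1e-12:
                new_centers.append(list(c))  # leave an empty cluster as is
                continue
            new_centers.append([
                sum(row[k] * f[j] for row, f in zip(assoc, features)) / mass
                for j in range(len(c))
            ])
        centers = new_centers
    # Hard assignment: the cluster with the largest association per pixel.
    labels = [max(range(len(row)), key=row.__getitem__) for row in assoc]
    return labels, centers


def refine_with_superpixels(labels, superpixels):
    """Give every pixel the majority label of its superpixel.

    ASSUMPTION: a majority vote stands in for Eq. (7).
    """
    votes = {}
    for lab, sp in zip(labels, superpixels):
        votes.setdefault(sp, Counter())[lab] += 1
    majority = {sp: cnt.most_common(1)[0][0] for sp, cnt in votes.items()}
    return [majority[sp] for sp in superpixels]
```

In this sketch, `features` would hold one feature vector per pixel (for example, the flattened FTS output) and `superpixels` one superpixel id per pixel from any over-segmentation method. The refined labels would then play the role of the supervision signal in step 15.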
Summary
Object segmentation is a challenging problem in computer vision and has been widely applied in areas such as object recognition and image classification. Superpixels are often taken as important cues for aiding segmentation. One typical work is MLSS [7], in which a multi-layer semi-supervised learning scheme generates pairwise affinities from a sparse graph constructed on pixels and over-segmented regions, yielding a dense affinity matrix over pixels and superpixels for spectral clustering. Another notable work is SAS [8], a segmentation framework based on bipartite graph partitioning that is designed to aggregate multi-layer superpixels. It is further fine-tuned by post-processing techniques such as leakage avoidance, fake boundary removal, and small-region merging to generate robust segmentation masks.