Abstract
Convolutional Neural Networks (CNN) have brought spectacular improvements in several fields of machine vision including object, scene and face recognition. Nonetheless, the impact of this new paradigm on the classification of fine-grained images—such as colour textures—is still controversial. In this work, we evaluate the effectiveness of traditional, hand-crafted descriptors against off-the-shelf CNN-based features for the classification of different types of colour textures under a range of imaging conditions. The study covers 68 image descriptors (35 hand-crafted and 33 CNN-based) and 46 compilations of 23 colour texture datasets divided into 10 experimental conditions. On average, the results indicate a marked superiority of deep networks, particularly with non-stationary textures and in the presence of multiple changes in the acquisition conditions. By contrast, hand-crafted descriptors were better at discriminating stationary textures under steady imaging conditions and proved more robust than CNN-based features to image rotation.
Highlights
Colour texture analysis and classification play a pivotal role in many computer-vision applications such as surface inspection, remote sensing, medical image analysis, object recognition, content-based image retrieval and many others
ResNet outperformed the other networks by a wide margin and, interestingly, the FC configuration emerged as the best of the three strategies considered for extracting CNN-based features (FC, Bag of Visual Words (BoVW) and Vectors of Locally-Aggregated Descriptors (VLAD))
The results indicate that the best-performing hand-crafted descriptors (e.g., Opponent Colour Local Binary Patterns (OCLBP), LCVBP and Integrative Co-occurrence Matrices (ICM)) were generally slower than the CNN-based methods in the feature extraction step; in the classification step the situation was reversed in favour of the hand-crafted descriptors, owing to the lower dimensionality of these descriptors
Summary
Colour texture analysis and classification play a pivotal role in many computer-vision applications such as surface inspection, remote sensing, medical image analysis, object recognition, content-based image retrieval and many others. The mappings learned by CNN are amenable to being transferred from one domain to another, making networks trained on certain classes of images usable in completely different contexts [5,9]. The consequences of this are far-reaching: although datasets large enough to train a CNN entirely from scratch are rarely available in practical tasks, pre-trained networks can in principle be used as off-the-shelf feature extractors in a wide range of applications. Some recent results seem to point in that direction, but it is precisely the aim of this work to investigate this matter further. To this end, we comparatively evaluated the performance of a large number of classic and more recent hand-designed, local image descriptors against a selection of off-the-shelf features from last-generation CNN.
Sci. 2019, 9, 738
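The highlights name BoVW and VLAD as two ways of pooling local CNN activations into a single image descriptor, alongside the FC strategy, which takes the output of a fully-connected layer directly. The following is a minimal sketch of the two pooling schemes only, not the authors' pipeline. The feature dimension, the codebook size, the random stand-in data and the choice of L1/L2 normalisation are all assumptions made for illustration; in practice the local features would come from a convolutional layer of a pre-trained network and the codebook from a clustering step such as k-means.

```python
import math
import random


def nearest(feat, codebook):
    """Index of the codeword closest to feat (squared Euclidean distance)."""
    dists = [sum((f - c) ** 2 for f, c in zip(feat, word)) for word in codebook]
    return dists.index(min(dists))


def bovw(local_feats, codebook):
    """Bag of Visual Words: L1-normalised histogram of codeword assignments."""
    hist = [0.0] * len(codebook)
    for feat in local_feats:
        hist[nearest(feat, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist]


def vlad(local_feats, codebook):
    """VLAD: per-codeword sum of residuals, concatenated and L2-normalised."""
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for feat in local_feats:
        k = nearest(feat, codebook)
        for j in range(dim):
            acc[k][j] += feat[j] - codebook[k][j]
    flat = [x for row in acc for x in row]
    norm = math.sqrt(sum(x * x for x in flat))
    return [x / norm for x in flat] if norm > 0 else flat


# Hypothetical stand-ins: 196 local features of dimension 16 (e.g. a 14x14 grid
# of convolutional activations) and a 4-word codebook (normally from k-means).
random.seed(0)
local_feats = [[random.gauss(0, 1) for _ in range(16)] for _ in range(196)]
codebook = [[random.gauss(0, 1) for _ in range(16)] for _ in range(4)]

h = bovw(local_feats, codebook)
v = vlad(local_feats, codebook)
print(len(h), len(v))  # 4 64
```

The sketch makes the dimensionality difference visible: BoVW yields one value per codeword, whereas VLAD yields one residual vector per codeword, so its length is the codebook size times the local feature dimension.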