Abstract
Labeling data is often difficult, expensive, and time-consuming because it requires the effort of experienced human annotators, and the data sets involved are often large and noisy. Co-training is a practical and powerful semi-supervised learning method, as it yields high classification accuracy with a training set that contains only a small number of labeled samples. For co-training to succeed, the features must satisfy two important conditions: diversity and sufficiency. In this paper, we propose a novel mutual information based approach, inspired by the idea of dependent component analysis, to achieve feature splits that are maximally independent between-subsets (diverse) or within-subsets (sufficient). In addition, we demonstrate the application of the method to a real-world problem, the classification of laser tread mapping tire data. We introduce several features designed to highlight physical characteristics of the tire data, as well as local or global descriptors such as histograms, gradients, or representations in other domains. Results from both simulations and tire image classification confirm that co-training with the proposed feature set and feature splits consistently yields higher accuracy than supervised classification when only a small set of labeled training data is available. The proposed method is a very promising complement to time-consuming and subjective expert labeling, reducing expert effort to a minimum. Further results show that, with a probabilistic multi-layer perceptron classifier as the base learner in co-training, our method yields meaningful continuous measures of the progression of irregular wear on the tire surface.
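As a rough illustration of the generic co-training loop referred to above, the following minimal sketch trains one learner per feature view and lets each learner label the unlabeled samples it is most confident about. It is not the implementation used in this work: the nearest-centroid base learner stands in for the probabilistic multi-layer perceptron, the two views are fixed by hand rather than chosen by the mutual information based split, and the names `CentroidClassifier`, `co_train`, `rounds`, and `per_round`, as well as the toy data, are assumptions made only for this example.

```python
import math


class CentroidClassifier:
    """Nearest-centroid learner with a soft confidence score (stand-in base learner)."""

    def fit(self, X, y):
        # One centroid per class: the column-wise mean of that class's rows.
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        # Closer centroids get larger scores; scores are normalized to sum to 1.
        scores = {c: math.exp(-math.dist(x, m)) for c, m in self.centroids.items()}
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}


def co_train(view_a, view_b, labels, rounds=10, per_round=2):
    """Generic co-training loop. labels[i] is a class label, or None if unlabeled."""
    labels = list(labels)
    views = (view_a, view_b)
    for _ in range(rounds):
        labeled = [i for i, t in enumerate(labels) if t is not None]
        if len(labeled) == len(labels):
            break  # nothing left to label
        new_labels = {}
        for view in views:
            # Train this view's learner on the currently labeled pool only.
            learner = CentroidClassifier().fit(
                [view[i] for i in labeled], [labels[i] for i in labeled]
            )
            scored = []
            for i, t in enumerate(labels):
                if t is None and i not in new_labels:
                    proba = learner.predict_proba(view[i])
                    best = max(proba, key=proba.get)
                    scored.append((proba[best], i, best))
            # Keep only this learner's most confident predictions.
            scored.sort(reverse=True)
            for _, i, best in scored[:per_round]:
                new_labels[i] = best
        # Both learners' confident predictions join the shared labeled pool.
        for i, best in new_labels.items():
            labels[i] = best
    return labels


# Toy data (hypothetical): two well-separated classes, one 1-D feature per view.
truth = [0] * 10 + [1] * 10
view_a = [[0.1 * k] for k in range(10)] + [[3 + 0.1 * k] for k in range(10)]
view_b = [[0.1 * (9 - k)] for k in range(10)] + [[3 + 0.1 * (9 - k)] for k in range(10)]
initial = [None] * 20
initial[0], initial[10] = 0, 1  # a single labeled sample per class

completed = co_train(view_a, view_b, initial)
```

Starting from one labeled sample per class, `completed` holds a label for every sample once the loop finishes. On real data, the choice of views matters far more than in this toy setting, which is the motivation for constructing the feature split deliberately.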