Abstract
Classification is an important machine learning technique that is attracting growing interest in various manufacturing applications. Learning an accurate classifier generally requires a large, perfectly labeled training dataset. However, such "golden" labels are both expensive and difficult to collect in practice. To facilitate accurate classification in the presence of noisy labels, we propose a novel hybrid method based on active learning and data cleaning. Specifically, we first train an initial classifier on noisily labeled data. Based on its predictions, a set of the most informative samples is queried for manual annotation. To correct the remaining incorrect labels, we then self-label the unqueried samples using both the true labels provided by human experts and the labels estimated by the initial classifier. Experimental results on two industrial datasets show that the proposed approach achieves higher accuracy than conventional methods.
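The three steps described above (train on noisy labels, query informative samples, self-label the rest) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, since the abstract does not give these details: the nearest-centroid classifier, the smallest-margin query rule, the confidence threshold, the synthetic data, and the names `fit_centroids`, `predict_proba`, `clean_labels`, and `oracle` are all hypothetical. In particular, the self-labeling rule here simply trusts confident predictions of the initial classifier; the proposed method also uses the expert labels in this step, in a way the abstract does not specify.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    # Stand-in classifier (assumption): one mean vector per class.
    # Assumes every class has at least one sample under y.
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(centroids, X):
    # Softmax over negative squared distances to each class centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def clean_labels(X, y_noisy, oracle, n_classes, budget, threshold=0.9):
    # Step 1: train the initial classifier on the noisy labels.
    centroids = fit_centroids(X, y_noisy, n_classes)
    proba = predict_proba(centroids, X)

    # Step 2: query the samples with the smallest top-two margin
    # (assumed notion of "most informative").
    top_two = np.sort(proba, axis=1)[:, -2:]
    margin = top_two[:, 1] - top_two[:, 0]
    queried = np.argsort(margin)[:budget]
    y_clean = y_noisy.copy()
    y_clean[queried] = oracle(queried)  # expert-provided labels

    # Step 3: self-label unqueried samples where the initial classifier
    # is confident (assumed rule); otherwise keep the noisy label.
    unqueried = np.ones(len(X), dtype=bool)
    unqueried[queried] = False
    relabel = unqueried & (proba.max(axis=1) >= threshold)
    y_clean[relabel] = proba.argmax(axis=1)[relabel]
    return y_clean, queried

# Synthetic usage example: two well-separated blobs, 20% of labels flipped.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 1.0, size=(100, 2)),
               rng.normal(3.0, 1.0, size=(100, 2))])
y_true = np.repeat([0, 1], 100)
y_noisy = y_true.copy()
flip = rng.choice(len(y_true), size=40, replace=False)
y_noisy[flip] = 1 - y_noisy[flip]

y_clean, queried = clean_labels(
    X, y_noisy, oracle=lambda idx: y_true[idx], n_classes=2, budget=10
)
acc_noisy = (y_noisy == y_true).mean()
acc_clean = (y_clean == y_true).mean()
```

On easy synthetic data like this, `acc_clean` should exceed `acc_noisy`, but that says nothing about the industrial datasets mentioned above. The sketch is only meant to make the data flow between the three steps concrete.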