Abstract

Data imbalance is a key factor limiting the performance of automatic defect recognition systems for castings. Because normal samples vastly outnumber defect samples, the trained model becomes biased and its performance deteriorates. Traditional re-sampling methods alter the data distribution at random and ignore the significant intra-class variation among normal samples. This paper therefore proposes a distribution-preserving under-sampling method for imbalanced defect recognition in castings. Specifically, the method divides the normal samples into several sub-groups by cluster analysis and reassembles them into multiple balanced datasets, so that the normal samples in each balanced dataset follow the same distribution as those in the original imbalanced dataset. Experiments on our dataset of 3260 images show that the proposed method achieves an AUC (area under the curve) score of 0.816, a significant advantage over cost-sensitive learning and re-sampling methods.
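The reassembly step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each normal sample already has a cluster label from some clustering algorithm, and the function name `distribution_preserving_subsets`, its parameters, and the toy cluster sizes are all hypothetical. The sketch only shows how members of each cluster can be dealt evenly across subsets so that every subset keeps roughly the original cluster proportions.

```python
import random
from collections import defaultdict


def distribution_preserving_subsets(cluster_labels, n_subsets, seed=0):
    """Split sample indices into n_subsets groups with similar cluster proportions.

    cluster_labels[i] is the precomputed cluster id of normal sample i
    (assumed to come from any clustering algorithm).
    """
    rng = random.Random(seed)

    # Group sample indices by their cluster id.
    by_cluster = defaultdict(list)
    for idx, label in enumerate(cluster_labels):
        by_cluster[label].append(idx)

    subsets = [[] for _ in range(n_subsets)]
    counter = 0  # running position, so leftover samples rotate across subsets
    for label in sorted(by_cluster):
        members = by_cluster[label]
        rng.shuffle(members)  # random assignment within the cluster
        # Deal members round-robin: each subset receives an equal share
        # (within one sample) of this cluster.
        for idx in members:
            subsets[counter % n_subsets].append(idx)
            counter += 1
    return subsets


# Toy example (hypothetical): 12 normal samples in clusters of size 6, 4 and 2.
labels = [0] * 6 + [1] * 4 + [2] * 2
subsets = distribution_preserving_subsets(labels, n_subsets=2)
```

In this toy case each subset receives three, two and one samples from the three clusters, matching the 6:4:2 ratio of the full set. Combining each subset with the defect samples to form one balanced training set is a plausible next step, but that pairing is an assumption here rather than something the abstract specifies.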
