Abstract

Deep learning technology has gradually matured and has been successfully applied in many fields, bringing great convenience to everyday life. However, while researchers continuously improve the quality of network models, they often neglect the importance of early-stage sample collection and data processing, which frequently leads to poor model performance in practical projects. The Data-centric AI campaign has recently emerged in the deep learning community to direct researchers' attention to data quality. Motivated by this campaign, this paper proposes a sample selection method based on feature density. The method targets practical scenarios in which the distributions of the training set and test set deviate greatly, the dataset is highly redundant, or dataset sampling needs guidance. It uses a triplet loss function to constrain and aggregate sample feature points, constructs a feature density space with a feature extractor, traverses the feature points to compute distances between samples, judges category-specific redundancy and deletes redundant samples, and finally retains the remaining samples as highly representative ones to optimize the dataset distribution. The experiments use the self-built dataset IP05. The experimental results show that the samples selected by the feature density sample selection method are indeed more representative.
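
The pipeline described above can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: the function names `triplet_loss` and `select_representative`, the Euclidean distance, the `radius` threshold, and the greedy traversal order are all hypothetical. Features are assumed to have been produced already by some feature extractor.

```python
import math
from collections import defaultdict


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    Pulls same-class features together and pushes other classes away.
    The margin value is an assumption, not taken from the paper.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def select_representative(features, labels, radius):
    """Greedy per-class redundancy filter (hypothetical interpretation).

    Traverses the feature points in order. A sample is treated as
    redundant when an already retained sample of the same class lies
    closer than `radius` in feature space. Returns retained indices.
    """
    kept_by_class = defaultdict(list)  # label -> indices retained so far
    kept = []
    for i, (feat, label) in enumerate(zip(features, labels)):
        redundant = any(
            math.dist(feat, features[j]) < radius
            for j in kept_by_class[label]
        )
        if not redundant:
            kept_by_class[label].append(i)
            kept.append(i)
    return kept


# Toy 2-D features: sample 1 nearly duplicates sample 0 within class "a".
features = [(0.0, 0.0), (0.05, 0.0), (1.0, 1.0), (0.0, 0.02), (1.0, 1.01)]
labels = ["a", "a", "a", "b", "b"]
print(select_representative(features, labels, radius=0.1))  # [0, 2, 3, 4]
```

In this toy run only sample 1 is dropped. Sample 3 lies close to sample 0 in feature space but is kept because it belongs to a different class, which reflects the per-category reading of redundancy assumed here.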
