Abstract

Data reduction produces a reduced representation of a data set that is smaller than the original yet approximately preserves its integrity. Mining the reduced data set is more efficient and yields the same, or almost the same, analysis results. This paper proposes two data reduction methods. First, an estimation algorithm for continuous multivariate coupled distributions of arbitrary form is proposed: the distribution is estimated from samples using the empirical distribution function, and new individuals are generated by sampling from it. Second, the idea of clustering is introduced into data reduction to form a clustering-based method for reducing the time dimension, whose basic idea is to cluster the time dimension of time series data. To verify the feasibility of the two methods, a set of simulation experiments is designed in which each method is applied to representative data. The experiments show that both methods not only effectively reduce the amount of data, achieving the purpose of data reduction, but also improve classification accuracy, and are therefore highly practical.
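The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes that sampling from the empirical distribution is done per variable by inverse-transform sampling, and that the time dimension is clustered with plain k-means, each cluster of time points being replaced by its mean. The function names `sample_from_ecdf` and `cluster_time_points`, their parameters, and the toy data are all hypothetical.

```python
import numpy as np


def sample_from_ecdf(samples, n_new, rng):
    """Draw n_new values from the empirical distribution of 1-D samples.

    Assumption: inverse-transform sampling, i.e. uniform draws are mapped
    through the empirical quantile function (linear interpolation).
    """
    u = rng.uniform(0.0, 1.0, size=n_new)
    return np.quantile(np.asarray(samples, dtype=float), u)


def cluster_time_points(series, k, n_iter=50, seed=0):
    """Reduce the time dimension of series (n_series x n_times) to k columns.

    Assumption: each time point is a vector of length n_series, time points
    are grouped by plain k-means, and each cluster becomes its mean column.
    Returns (reduced, labels) with reduced of shape (n_series, k).
    """
    X = np.asarray(series, dtype=float).T  # one row per time point
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]

    def assign(c):
        # squared Euclidean distance from every time point to every center
        d = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        # an empty cluster keeps its previous center
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers.T, assign(centers)


# Toy usage with made-up data.
rng = np.random.default_rng(0)
drawn = sample_from_ecdf([1.0, 2.0, 3.0, 4.0, 5.0], n_new=1000, rng=rng)

series = np.array([
    [0.0, 0.1, 0.2, 10.0, 10.1, 10.2],
    [0.0, 0.1, 0.2, 10.0, 10.1, 10.2],
])
reduced, labels = cluster_time_points(series, k=2)
print(drawn.min(), drawn.max())
print(reduced.shape, labels)
```

In this sketch the six time points collapse into two representative columns, and the generated values never leave the range of the observed samples. A multivariate version would also have to model the dependence between variables, which the sketch omits.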

