Abstract

Fast data streams do not permit multi-pass scanning or random access, and traditional data mining algorithms cannot examine every sample in a stream. Fuzzy decision trees combine the understandability of decision trees with the ability of fuzzy sets to represent fuzzy and uncertain information, so data stream mining algorithms based on fuzzy decision tree theory are valuable for improving mining accuracy. Existing fuzzy decision tree methods assign samples to leaf nodes according to their membership, which yields low accuracy for low-membership samples with missing values. To address this problem, this paper presents a multivariate branch fuzzy decision tree data stream mining strategy based on a hybrid partitioning standard (MHFlexDT), which constructs a multivariate branch fuzzy tree structure. The data over-fitting problem is addressed by adding temporary branches for uncertain data. At the same time, the McDiarmid bound is used as a threshold to effectively limit the depth of the decision tree. Experimental results show that, compared with the fuzzy decision tree data mining strategy, MHFlexDT is more effective in large-scale data stream mining: it reduces system computation, controls decision tree depth, and maintains high accuracy in the presence of missing values, data over-fitting, and noisy data.
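The abstract does not state how the McDiarmid bound threshold is computed, so the following is only a minimal sketch of how such a threshold is commonly used to decide whether a node in a stream decision tree may split. It assumes a split measure whose value changes by at most `sensitivity / n_samples` when a single sample is replaced; under that assumption McDiarmid's inequality gives a deviation of at most `sensitivity * sqrt(ln(1/delta) / (2 * n_samples))` with probability at least `1 - delta`. The names `mcdiarmid_epsilon`, `should_split`, `sensitivity`, and `delta` are hypothetical and do not come from the paper.

```python
import math


def mcdiarmid_epsilon(n_samples: int, delta: float, sensitivity: float = 1.0) -> float:
    """Deviation bound from McDiarmid's inequality.

    Assumes (hypothetically) that replacing one of the n_samples samples
    changes the split measure by at most sensitivity / n_samples.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return sensitivity * math.sqrt(math.log(1.0 / delta) / (2.0 * n_samples))


def should_split(best_gain: float, second_gain: float,
                 n_samples: int, delta: float, sensitivity: float = 1.0) -> bool:
    """Split only when the best attribute beats the runner-up by more than the bound."""
    return (best_gain - second_gain) > mcdiarmid_epsilon(n_samples, delta, sensitivity)
```

Because a node splits only once the observed advantage of the best attribute exceeds the bound, weakly supported splits are deferred until more samples arrive, which is one way such a threshold can keep tree depth in check.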
