Abstract

Data mining is an intelligent data analysis technology that emerged in the late 20th century. It can automatically extract or discover useful models and knowledge from large amounts of data in databases, data warehouses, or other data repositories. Classification is an important research topic within data mining. Among the many classification methods available, the decision tree algorithm is clear, easy to understand, and easy to convert into classification rules, so it has been widely studied and applied. Set against the background of a “data platform for public petition”, this paper studies how a data mining system can be combined with the existing database to extract useful information from the many characteristics hidden in the data and to provide comprehensive analysis for system managers and decision makers. The paper focuses on the basic principles and basic algorithms of data mining. The case classification module was developed using a decision tree algorithm: based on an improved ID3 decision tree algorithm, a decision tree model is built from the case information in one library and the client information in another, and it is used to give a comprehensive assessment of a given case. This paper presents a simplified, entropy-weighted algorithm based on ID3. Its main idea is first to combine the Taylor formula with the entropy calculation that ID3 uses for attribute selection, which simplifies the entropy calculation, changes the attribute selection criterion, reduces the computational complexity, and improves running efficiency. Each attribute's simplified entropy is then given a weight N, which depends on the number of values the attribute takes, to balance the uncertainty of each attribute on the data set.
This makes attribute selection more reasonable and avoids results that are inconsistent with the actual attributes.
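
As a rough illustration of the idea described in the abstract, the sketch below compares the standard ID3 information gain with one possible simplified, weighted score. The abstract does not give the formulas, so the following are assumptions rather than the paper's method: the restriction to two classes, the first-order approximation ln(1 - x) ≈ -x used to remove the logarithm, the choice of the number of distinct attribute values as the weight N, and the decision to multiply by N. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def partition(rows, labels, attr):
    """Group the class labels by the value each row has for `attr`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    return groups


def information_gain(rows, labels, attr):
    """Standard ID3 criterion: higher is better."""
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g)
                      for g in partition(rows, labels, attr).values())
    return entropy(labels) - conditional


def simplified_score(rows, labels, attr, positive):
    """Assumed log-free criterion for two classes: lower is better.

    With ln(1 - x) ~ -x, the conditional entropy of `attr` becomes
    proportional to sum_i p_i * n_i / (p_i + n_i), where p_i and n_i
    count positive and negative labels in group i. Constant factors
    shared by all attributes are dropped.
    """
    groups = partition(rows, labels, attr)
    score = 0.0
    for g in groups.values():
        p = sum(1 for label in g if label == positive)
        n = len(g) - p
        score += p * n / (p + n)
    # Hypothetical weight N: the number of distinct values of the attribute.
    return len(groups) * score


# Hypothetical toy data: "x" separates the classes perfectly, "y" does not.
rows = [{"x": 0, "y": 0}, {"x": 0, "y": 1},
        {"x": 1, "y": 0}, {"x": 1, "y": 1}]
labels = ["no", "no", "yes", "yes"]

for attr in ("x", "y"):
    print(attr,
          information_gain(rows, labels, attr),
          simplified_score(rows, labels, attr, positive="yes"))
```

On this toy data both criteria select `x`: it has the highest information gain and the lowest simplified score, and the simplified score needs no logarithm evaluations.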
