Abstract

Based on the current state of data mining systems, this paper introduces centralized and distributed data mining systems in turn, focusing on a detailed description of the components of the centralized data mining system and the technologies used to implement them. It also summarizes the current development of both kinds of system and then proposes research directions and development trends for data mining systems: enhancing visualization and interaction, improving scalability, following a single standard, combining with specific industry applications, and supporting data mining in mobile environments. Finally, it gives a brief summary of and outlook on the development of data mining systems.

Data mining, also known as knowledge discovery in databases (KDD), is the process of extracting interesting knowledge from large amounts of data stored in databases, data warehouses, or other information repositories. In recent years, to promote the practical application of data mining, many researchers have studied the architecture of data mining systems. A well-structured data mining system should have the following features: 1) complete system functions and auxiliary tools; 2) scalability; 3) support for multiple data sources; 4) the capacity to process large amounts of data; 5) a good user interface and the ability to present results. At present, data mining systems are mainly either centralized or distributed, and the specific structure and components of each can be implemented with a variety of techniques and methods.

Centralized data mining system

The single database/data mining system is the more mature form of data mining application system at present, and much commercial data mining software is based on this structure.
An analysis of the current main data mining systems shows that different products implement the various functional modules with different techniques.

User interface and knowledge presentation layer. In this layer, presenting mining results through a friendly user interface and data visualization technologies can greatly improve the usability of the system. Visualization in data mining uses visualization technology to uncover implicit and useful knowledge in large data sets, and it mainly covers visualization of the data, of the mining process, and of the mining model. Current visualization techniques mainly include traditional geometric methods (such as graphs, histograms, scatter plots, and pie charts), SOM network visualization, the parallel coordinates technique, and pixel-oriented visualization. SOM-based visualization and parallel coordinates are the two most widely applied; both work by mapping high-dimensional data onto a two-dimensional plane for display. For example, Wang Jiacai et al. designed VISMiner, a visual mining system based on the SOM network, and Liu Kan et al. studied the specific application of the parallel coordinates technique in data mining systems.

Control layer. The control layer controls the execution flow of the system and coordinates the relationships among the functional modules and their execution order. Its work mainly includes analyzing the data mining task and determining the data involved in the mining task and the data mining algorithm.
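
To make the parallel coordinates technique mentioned in the presentation-layer discussion more concrete, the following is a minimal sketch of how a high-dimensional record can be mapped to a polyline in the two-dimensional plane. It is an illustration only, not the method used by any of the systems cited above; the function name `parallel_coordinates`, the `width` parameter, and the sample records are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def parallel_coordinates(rows: Sequence[Sequence[float]],
                         width: float = 1.0) -> List[List[Point]]:
    """Map each n-dimensional record to a polyline of n points in 2D.

    Dimension d is drawn as a vertical axis at x = d * step, and each
    value is min-max scaled to [0, 1] along that axis. (Hypothetical
    helper written for illustration.)
    """
    if not rows:
        return []
    n_dims = len(rows[0])
    # Per-dimension minimum and maximum, used for min-max scaling.
    lows = [min(r[d] for r in rows) for d in range(n_dims)]
    highs = [max(r[d] for r in rows) for d in range(n_dims)]
    # Horizontal spacing between neighbouring axes.
    step = width / (n_dims - 1) if n_dims > 1 else 0.0

    polylines: List[List[Point]] = []
    for r in rows:
        line: List[Point] = []
        for d in range(n_dims):
            span = highs[d] - lows[d]
            # A constant dimension has no spread; place it mid-axis.
            y = (r[d] - lows[d]) / span if span else 0.5
            line.append((d * step, y))
        polylines.append(line)
    return polylines


# Three made-up four-dimensional records.
records = [
    (1.0, 200.0, 3.0, 10.0),
    (2.0, 100.0, 3.0, 30.0),
    (3.0, 150.0, 3.0, 20.0),
]
lines = parallel_coordinates(records, width=3.0)
for line in lines:
    print(line)
```

Each returned polyline can be handed to any plotting library to draw one line per record; records with similar values then trace similar paths across the axes, which is what makes clusters and outliers visible in the plane.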
