Abstract

Data mining is becoming increasingly important as databases grow ever larger and the need to discover hidden rules in them becomes widely recognized. Database systems are currently dominated by relational databases, so the ability to perform data mining with standard SQL queries would considerably ease the implementation of data mining. However, the performance of SQL-based data mining is known to fall behind that of specialized implementations and of the expensive mining tools available commercially. In this paper we present an evaluation of SQL-based data mining on a commercial RDBMS (IBM DB2 UDB EEE). We examine techniques that reduce I/O cost by using views and subqueries. The resulting queries can be more than 6 times faster than the previously reported SETM SQL query. In addition, we evaluate performance in a parallel database environment and compare the results with a commercial data mining tool (IBM Intelligent Miner). We show that SQL-based data mining can achieve sufficient performance through SQL query customization and database tuning.

Keywords: data mining, parallel SQL, query optimization, commercial RDBMS

