Abstract

Data mining is the process of analyzing large datasets to extract patterns, drawing on artificial intelligence, machine learning, and statistics. The decision tree is one of the well-established classification models in data mining. The size and dimensionality of today's data are growing exponentially, so finding informative patterns is a crucial task. Organizations require distributed systems to store and process huge amounts of data. The proposed method is a parallel implementation of decision tree methods based on attribute partitioning, in which the dataset is partitioned into multiple subsets of dimensions (attributes). We develop a Map-Reduce programming model for processing data with a decision tree classifier based on attribute partitioning. The experimental results show an improvement in classification accuracy compared with a traditional decision tree method.
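
The idea of attribute partitioning can be illustrated with a minimal single-machine sketch. It is not the implementation evaluated in this work. It rests on several assumptions that the abstract does not state: attributes are split round-robin, each subset trains a depth-1 tree (a stump) chosen by Gini impurity, and the per-subset predictions are combined by majority vote. Python's built-in `map` stands in for the distributed map step, and all function names (`partition_attributes`, `train_stump`, `train_all`, `predict`) are hypothetical.

```python
from collections import Counter


def partition_attributes(n_attributes, n_parts):
    """Split attribute indices 0..n_attributes-1 into n_parts round-robin subsets."""
    return [list(range(i, n_attributes, n_parts)) for i in range(n_parts)]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def train_stump(rows, labels, attrs):
    """Train a depth-1 tree that may only split on the attributes in attrs.

    Assumption: a stump stands in for a full decision tree to keep the sketch short.
    """
    majority = Counter(labels).most_common(1)[0][0]
    best = None  # (weighted impurity, attribute, threshold, left label, right label)
    for a in attrs:
        for t in sorted({row[a] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[a] <= t]
            right = [y for row, y in zip(rows, labels) if row[a] > t]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (
                    score,
                    a,
                    t,
                    Counter(left).most_common(1)[0][0],
                    Counter(right).most_common(1)[0][0],
                )
    if best is None:
        # No usable split: always predict the majority class.
        return lambda row: majority
    _, a, t, left_label, right_label = best
    return lambda row: left_label if row[a] <= t else right_label


def train_all(rows, labels, partitions):
    """'Map' step: one independent training task per attribute subset."""
    return list(map(lambda attrs: train_stump(rows, labels, attrs), partitions))


def predict(models, row):
    """'Reduce' step: combine the per-subset predictions by majority vote."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data (made up for illustration): attributes 0 and 1 separate the
    # classes, attribute 2 is noise.
    rows = [(1, 1, 7), (2, 1, 3), (8, 5, 7), (9, 6, 3)]
    labels = ["low", "low", "high", "high"]
    models = train_all(rows, labels, partition_attributes(3, 3))
    print(predict(models, (7, 5, 3)))
```

In a real Map-Reduce deployment each call inside `train_all` would run as a separate map task on its own attribute subset, and the vote in `predict` would be carried out by a reducer. The voting rule here is only one plausible way to combine the per-subset trees.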
